2. HPSC MOOC¶
These projects are based on the course “High Performance Scientific Computing” by Professor Randall LeVeque.
This page contains the following codes:
- Workspace 1: Testing Scripts and Dummy Code
test1.py (python script to test for environment variables and python version)
test2.sh (shell script to test for environment variables, gfortran and ipython version)
test3.f90 (a dummy fortran code)
- Workspace 2: Quadratic, Cubic and Polynomial Interpolation
hw2a.py (python script for quadratic interpolation)
hw2b.py (python script for quadratic, cubic and polynomial interpolation)
- Workspace 3: Newton’s Method and Convergence Criteria
newton.py (ipython module to compute square root of a number using Newton’s method)
intersections.py (ipython script to plot a graph where two functions intersect - uses newton.py)
test1.f90, newton.f90, intersections.f90, functions.f90, Makefile (fortran code for the intersection problem using max iterations for convergence)
problem7/newton.f90, problem7/functions.f90, problem7/test_quartic.f90, problem7/Makefile, problem7/intersections.py (intersection problem using residual error for convergence)
- Workspace 4: OpenMP
quadrature.f90 (module for integrating a function)
test1.f90 (cubic function - uses quadrature module)
test2.f90 (sinusoidal function - uses quadrature module)
quadrature_omp.f90, test2_omp.f90 (same as above but using OpenMP)
- Workspace 5: Load Balancing
Makefile (makefile for the 1D solution)
functions.f90 (sinusoidal function module)
quadrature.f90, test.f90 (trapezoid rule module + test - serial)
quadrature2.f90, test2.f90 (simpson’s rule module + test - serial)
quadrature3.f90, test3.f90 (trapezoid rule module + test - OpenMP)
quad2d/functions.f90, quad2d/quadrature.f90, quad2d/test.f90, quad2d/Makefile (integrate a sinusoid with trapezoid rule in 2D with Open MP and load balancing)
- Workspace 6: MPI
Makefile (makefile for this directory)
copyvalue.f90 (passes a value from Process 0 to Processes 1-3 using MPI)
copyvalue2.f90 (passes a value from Process 3 to Processes 0-2 using MPI)
part2/functions.f90, part2/quadrature.f90, part2/Makefile (the sinusoidal function, quadrature module and makefile)
part2/test.f90 (compute the same integral over all Processes using MPI)
part2/test2.f90 (integral using Master-Worker paradigm)
- Project 1: 1D Integral and MPI
functions.f90, quadrature.f90, test2.f90, Makefile (compute a 1D integral using MPI by dividing up intervals between an indeterminate number of processes)
- Project 2: 20D Integral and the Monte Carlo Method
functions.f90, quadrature_mc.f90, random_util.f90, test_quad_mc.f90, Makefile (compute integral using Monte Carlo method for a 20 dimensional integral - serial)
plot_mc_quad_error.py (plot the result)
- Project 3: Laplace’s Equation and the Monte Carlo Method
laplace_mc.py (python version)
problem_description.f90, laplace_mc.f90, mc_walk.f90, random_util.f90, Makefile (compute the solution to Laplace’s Equation using the Monte Carlo Method - serial)
test1.f90, test2.f90 (trials I made along the way)
plot_mc_quad_error.py (plot the result)
- Project 4: Laplace’s Equation, MPI and the Monte Carlo Method
problem_description.f90, laplace_mc.f90, mc_walk.f90, random_util.f90, Makefile (compute the solution to Laplace’s Equation using the Monte Carlo Method - using MPI)
test1.f90, test1b.f90, test2.f90 (trials I made along the way)
plot_mc_quad_error.py (plot the result)
2.1. Workspace 1: Testing Scripts and Dummy Code¶
test1.py (python script to test for environment variables and python version)
"""
test1.py file
Note: this is a Python script. What you are reading now in the triple
quotes is a comment (a "docstring" since it's the first thing in this file).
This file is to test whether you have environment variables set properly and
are able to import numpy (for numerical computation), matplotlib,
and pylab (for plotting).
Instructions:
Make sure your environment variables $UWHPSC and $MYHPSC are set properly.
Insert your name where instructed below.
Type
$ python test1.py
at the Unix prompt ($ represents the prompt) to run the code.
If the output looks ok, change the file to set debug_mode = False
below and run it again. This time the results will go into a file
test1output.txt. Check that these look ok.
Commit the final version of this file along with test1output.txt to your git
repository.
"""
import os,sys
debug_mode = False
if debug_mode:
output_file = sys.stdout # send output to screen
else:
# Once it's working, send output to a file rather than screen:
output_file = open("test1output.txt","w")
sys.stdout = output_file
print "Code run by **Andrew Roberts**"
UWHPSC = os.environ.get('UWHPSC','** not set **')
print "Environment variable UWHPSC is %s" % UWHPSC
MYHPSC = os.environ.get('MYHPSC','** not set **')
print "Environment variable MYHPSC is %s" % MYHPSC
# Check for various Python modules and print an error message if
# an exception occurs:
try:
import numpy
print "Imported numpy ok"
except:
print "** Error importing numpy"
try:
import matplotlib
print "Imported matplotlib ok"
except:
print "** Error importing matplotlib"
try:
import pylab
print "Imported pylab ok"
except:
print "** Error importing pylab"
if not debug_mode:
output_file.close() # close the text file
test2.sh (shell script to test for environment variables, gfortran and ipython version)
# bash script to check for software
# Insert your name where indicated below, then type
# $ bash test2.sh
# When it looks good, redirect output to a file:
# $ bash test2h.sh > test2output.txt
echo
echo Code run by **Andrew Roberts**
echo Environment variable UWHPSC is $UWHPSC
echo Environment variable MYHPSC is $MYHPSC
echo
echo which ipython returns...
which ipython
echo
echo which gfortran returns...
which gfortran
echo
echo gfortran --version returns...
gfortran --version
echo
echo Compiling and running a Fortran code...
gfortran test3.f90
./a.out
test3.f90 (a dummy fortran code)
! Test fortran program.
! Insert your name where indicated below.
! This code will be compiled and run when you execute the test2.sh script.
program test
print *, "Code run by **Andrew Roberts**"
print *, "Successfully ran Fortran 90 program"
end program test
2.2. Workspace 2: Quadratic, Cubic and Polynomial Interpolation¶
hw2a.py (python script for quadratic interpolation)
"""
Demonstration script for quadratic interpolation.
Update this docstring to describe your code.
Modified by: ** your name here **
"""
import numpy as np
import matplotlib.pyplot as plt
from numpy.linalg import solve
# Set up linear system to interpolate through data points:
# Data points:
xi = np.array([-1., 1., 2])
yi = np.array([0., 4., 3.])
# It would be better to define A in terms of the xi points.
# Doing this is part of the homework assignment.
A = np.array([[1., -1., 1.], [1., 1., 1.], [1., 2., 4.]])
b = yi
# Solve the system:
c = solve(A,b)
print "The polynomial coefficients are:"
print c
# Plot the resulting polynomial:
x = np.linspace(-2,3,1001) # points to evaluate polynomial
y = c[0] + c[1]*x + c[2]*x**2
plt.figure(1) # open plot figure window
plt.clf() # clear figure
plt.plot(x,y,'b-') # connect points with a blue line
# Add data points (polynomial should go through these points!)
plt.plot(xi,yi,'ro') # plot as red circles
plt.ylim(-2,8) # set limits in y for plot
plt.title("Data points and interpolating polynomial")
plt.savefig('demo1plot.png') # save figure as .png file
hw2b.py (python script for quadratic, cubic and polynomial interpolation)
"""
Demonstration module for quadratic interpolation.
Update this docstring to describe your code.
Modified by: ** Andrew Roberts **
"""
import numpy as np
import matplotlib.pyplot as plt
from numpy.linalg import solve
def quad_interp(xi,yi):
"""
Quadratic interpolation. Compute the coefficients of the polynomial
interpolating the points (xi[i],yi[i]) for i = 0,1,2.
Returns c, an array containing the coefficients of
p(x) = c[0] + c[1]*x + c[2]*x**2.
"""
# check inputs and print error message if not valid:
error_message = "xi and yi should have type numpy.ndarray"
assert (type(xi) is np.ndarray) and (type(yi) is np.ndarray), error_message
error_message = "xi and yi should have length 3"
assert len(xi)==3 and len(yi)==3, error_message
# Set up linear system to interpolate through data points:
A = np.vstack([np.ones(3), xi, xi**2]).T
b = yi
c = solve(A,b)
return c
def cubic_interp(xi,yi):
"""
Cubic interpolation. Compute the coefficients of the polynomial
interpolating the points (xi[i],yi[i]) for i = 0,1,2,3.
Returns c, an array containing the coefficients of
p(x) = c[0] + c[1]*x + c[2]*x**2 +c[3]x**3.
"""
# check inputs and print error message if not valid:
error_message = "xi and yi should have type numpy.ndarray"
assert (type(xi) is np.ndarray) and (type(yi) is np.ndarray), error_message
error_message = "xi and yi should have length 4"
assert len(xi)==4 and len(yi)==4, error_message
# Set up linear system to interpolate through data points:
A = np.vstack([np.ones(4), xi, xi**2, xi**3]).T
b = yi
c = solve(A,b)
return c
def poly_interp(xi,yi):
"""Polynomial interpolation. Compute the coefficients of the polynomial
interpolating the points (xi[i],yi[i]) for i = 0,1,2... n
Returns c, an array containing the coefficients of
p(x) = c[0] + c[1]*x + c[2]*x**2 + c[3]x**3 +...+ c[n]x**n.
"""
# check inputs and print error message if not valid:
error_message = "xi and yi should have type numpy.ndarray"
assert (type(xi) is np.ndarray) and (type(yi) is np.ndarray), error_message
error_message = "xi and yi should have the same length"
assert len(xi)== len(yi), error_message
# Set up linear system to interpolate through data points:
#stacks them vertically and then takes transpose
#A = np.empty([len(xi),len(xi)])
#A[:,0] = 1.
#for i in range(0,len(xi),1):
# for j in range(1,len(xi),1):
# A[i,j]=xi[i]**(j)
# Uses a list comprehension:
n = len(xi)
A = np.vstack([xi**j for j in range(n)]).T
b = yi
c = solve(A,b)
return c
def plot_quad(xi,yi):
c = quad_interp(xi,yi)
x = np.linspace(xi.min()-1, xi.max() + 1, 1000)
y = c[0]+c[1]*x + c[2]*x**2
plt.figure(1)
plt.clf()
plt.plot(x,y,'b-')
plt.plot(xi,yi,'ro')
plt.xlabel('x (-)')
plt.ylabel('y (-)')
plt.title("Data points and interpolated polynomial")
plt.savefig('quadratic.png')
def plot_cubic(xi,yi):
c = cubic_interp(xi,yi)
x = np.linspace(xi.min()-1, xi.max() + 1, 1000)
y = c[0]+c[1]*x + c[2]*x**2 + c[3]*x**3
plt.figure(1)
plt.clf()
plt.plot(x,y,'b-')
plt.plot(xi,yi,'ro')
plt.xlabel('x (-)')
plt.ylabel('y (-)')
plt.title("Data points and interpolated polynomial")
plt.savefig('cubic.png')
def plot_poly(xi,yi):
c = poly_interp(xi,yi)
x = np.linspace(xi.min()-1, xi.max() + 1, 1000)
n = len(c)
y = c[n-1]
for j in range(n-1, 0, -1):
y = y*x + c[j-1]
plt.figure(1)
plt.clf()
plt.plot(x,y,'b-')
plt.plot(xi,yi,'ro')
plt.xlabel('x (-)')
plt.ylabel('y (-)')
plt.title("Data points and interpolated polynomial")
plt.savefig('poly.png')
def test_quad1():
"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-1., 0., 2.])
yi = np.array([ 1., -1., 7.])
c = quad_interp(xi,yi)
c_true = np.array([-1., 0., 2.])
print "c = ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)
def test_quad2():
"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-10., 1., 20.])
yi = np.array([ 15., -50., 70.])
c = quad_interp(xi,yi)
c_true = np.array([-48.16586922, -2.24162679, 0.40749601])
print "c = ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)
def test_cubic1():
"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-1., 1., 2., 3.])
yi = np.array([5., 2., 3., 5])
c = cubic_interp(xi,yi)
c_true = np.array([ 2.5, -1.41666667, 1., -0.08333333])
print "c = ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)
def test_poly1():
"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-1., 1., 2., 3.])
yi = np.array([5., 2., 3., 5])
c = cubic_interp(xi,yi)
c_true = np.array([ 2.5, -1.41666667, 1., -0.08333333])
print "c = ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)
def test_poly2():
"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-1., 0., 2., 5., 8.])
yi = np.array([ 1., -1., 7., 15., 25.])
c = poly_interp(xi,yi)
c_true = np.array([-1., 1.22777778, 2.51944444, -0.66111111, 0.04722222])
print "c = ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)
if __name__=="__main__":
# "main program"
# the code below is executed only if the module is executed at the command line,
# $ python demo2.py
# or run from within Python, e.g. in IPython with
# In[ ]: run demo2
# not if the module is imported.
print "Running test..."
test_quad1()
2.3. Workspace 3: Newton’s Method and Convergence Criteria¶
newton.py (ipython module to compute square root of a number using Newton’s method)
"""
Module to compute the square root of a number
using Newton's method: f(x) = x**2 - n = 0
Parameters:
max_iter :- maximum number of iterations for iterative method
tol :- relative residual tolerance for iterative method
"""
import numpy as np
max_iter = 20
tol = 1.0e-14
def solve(x0, debug=False):
"""
Compute the solution to the function using the initial guess & allow debugging
Inputs:
fvals_sqrt :- function returning tuple containing values of f(x) and f'(x)
x0 :- initial guess of the square root
debug :- prints x0 outside loop and x within loop, default = False
Outputs:
x :- computed value of the square root
iters :- the number of iterations taken
Example usage:
import newton; solve(4., True)
"""
iters = 0
x = x0
if(debug):
print "Initial guess: x = %20.15e" %x0
for i in range(max_iter):
(fx,dfx) = fvals_sqrt(x)
if(abs(fx)<tol):
break
x = x - (fx/dfx)
iters = iters + 1
if(debug):
print "At iteration %d x = %20.15e" %(iters,x)
return (x, iters)
def fvals_sqrt(x):
"""
Return a tuple containing the value of the function & it's derivative at x
Inputs:
x :- value for which f(x) and f'(x) are to be returned
Outputs:
f :- the value of the function, f(x)
fp :- the derivative of the function, f'(x)
Example usage:
(f, fp) = fvals_sqrt(10)
"""
#f = x**2 - 4.0
#fp = 2.0*x
f = x*np.cos(np.pi*x)-(1-(0.6*x**2))
fp = (-x*np.pi)*np.sin(np.pi*x)+np.cos(np.pi*x)+(1.2*x)
return (f, fp)
def test1(debug_solve=False):
"""
Test Newton's method using an array of initial gusses
Inputs:
None
Outputs:
Print x, iters, and f(x) to the screen
Example usage:
import newton; newton.test1()
"""
#for x in [1., 2., 100.]:
for x in [-2.2, -1.6, -0.8, 1.4]:
print ""
(xout, iters) = solve(x, debug=debug_solve)
print "After %d iterations, the solver returns x = %20.15e" %(iters,xout)
(f, fp) = fvals_sqrt(xout)
print "The function f(x) = %20.15e" %f
#assert abs(xout-2.0) < tol, "***Unexpected result: x = %22.15e" %xout
intersections.py (ipython script to plot a graph where two functions intersect - uses newton.py)
"""
Script to plot g1(x) and g2(x) versus x
and to show where they intersect with black dots
Example use:
>>> run intersections
"""
import newton
import matplotlib.pyplot as plt
import numpy as np
guesses = np.array([-2.2, -1.6, -0.8, 1.4])
xarray = np.zeros(4)
yarray = np.zeros(4)
i = 0
for xin in guesses:
(xout, iters) = newton.solve(xin)
xarray[i] = xout
yarray[i] = (1-(0.6*xout**2))
i = i+1
x = np.linspace(-5,5,1000)
g1 = x*np.cos(np.pi*x)
g2 = (1-(0.6*x**2))
plt.figure(1)
plt.clf()
plt.xlim(-5,5)
plt.xlabel('x (-)')
plt.ylabel('y (-)')
plt.title("Two functions and their intersecting points")
#p1 = plt.plot(x, g1, 'b', label=r'$y(x) = xcos(\pix)$')
p1 = plt.plot(x, g1, 'b', label=r'$y(x) = xcos(\pi x)$')
p2 = plt.plot(x, g2, 'r', label=r'$y(x) = 1-(0.6x^2)$')
p3 = plt.plot(xarray, yarray, 'ko', label=r'$intersection$')
plt.legend(loc="best")
#plt.legend([p1, p2, p3],["g1(x) = xcos(pi*x)", "g2(x) = 1-(0.6x**2)", "intersection"])
plt.show() #show the plot on the screen
plt.savefig('intersection.png') # save figure as .png file
test1.f90, newton.f90, intersections.f90, functions.f90, Makefile (fortran code for the intersection problem using max iterations for convergence)
! $MYHPSC/homework3/test1.f90
program test1
use newton, only: solve
use functions, only: f_sqrt, fprime_sqrt
implicit none
real(kind=8) :: x, x0, fx
real(kind=8) :: x0vals(3)
integer :: iters, i
logical :: debug ! set to .true. or .false.
print *, "Test routine for computing zero of f"
debug = .true.
! values to test as x0:
x0vals = (/1.d0, 2.d0, 100.d0 /)
do i=1,3
x0 = x0vals(i)
print *, ' ' ! blank line
call solve(f_sqrt, fprime_sqrt, x0, x, iters, debug)
print 11, x, iters
11 format('solver returns x = ', es22.15, ' after', i3, ' iterations')
fx = f_sqrt(x)
print 12, fx
12 format('the value of f(x) is ', es22.15)
if (abs(x-2.d0) > 1d-14) then
print 13, x
13 format('*** Unexpected result: x = ', es22.15)
endif
enddo
end program test1
! $MYHPSC/homework3/newton.f90
module newton
! module parameters:
implicit none
integer, parameter :: maxiter = 20
real(kind=8), parameter :: tol = 1.d-14
contains
subroutine solve(f, fp, x0, x, iters, debug)
! Estimate the zero of f(x) using Newton's method.
! Input:
! f: the function to find a root of
! fp: function returning the derivative f'
! x0: the initial guess
! debug: logical, prints iterations if debug=.true.
! Returns:
! the estimate x satisfying f(x)=0 (assumes Newton converged!)
! the number of iterations iters
implicit none
real(kind=8), intent(in) :: x0
real(kind=8), external :: f, fp
logical, intent(in) :: debug
real(kind=8), intent(out) :: x
integer, intent(out) :: iters
! Declare any local variables:
real(kind=8) :: deltax, fx, fxprime
integer :: k
! initial guess
x = x0
if (debug) then
print 11, x
11 format('Initial guess: x = ', es22.15)
endif
! Newton iteration to find a zero of f(x)
do k=1,maxiter
! evaluate function and its derivative:
fx = f(x)
fxprime = fp(x)
if (abs(fx) < tol) then
exit ! jump out of do loop
endif
! compute Newton increment x:
deltax = fx/fxprime
! update x:
x = x - deltax
if (debug) then
print 12, k,x
12 format('After', i3, ' iterations, x = ', es22.15)
endif
enddo
if (k > maxiter) then
! might not have converged
fx = f(x)
if (abs(fx) > tol) then
print *, '*** Warning: has not yet converged'
endif
endif
! number of iterations taken:
iters = k-1
end subroutine solve
end module newton
! $MYHPSC/homework3/intersections.f90
program intersections
use newton, only: solve
use functions, only: f_function, fprime_function
implicit none
real(kind=8) :: x, x0, fx
real(kind=8) :: x0vals(4)
integer :: iters, i
logical :: debug ! set to .true. or .false.
print *, "Test routine for computing zero of f"
debug = .true.
! values to test as x0:
x0vals = (/-2.2d0, -1.6d0, -0.8d0, 1.4d0 /)
do i=1,4
x0 = x0vals(i)
print *, ' ' ! blank line
call solve(f_function, fprime_function, x0, x, iters, debug)
print 11, x, iters
11 format('solver returns x = ', es22.15, ' after', i3, ' iterations')
fx = f_function(x)
print 12, fx
12 format('the value of f(x) is ', es22.15)
! if (abs(x-2.d0) > 1d-14) then
! print 13, x
!13 format('*** Unexpected result: x = ', es22.15)
! endif
enddo
end program intersections
! $MYHPSC/homework3/functions.f90
module functions
! parameters for module
! implicit none
! real(kind=8) :: pi
! save
contains
real(kind=8) function f_sqrt(x)
implicit none
real(kind=8), intent(in) :: x
f_sqrt = x**2 - 4.d0
end function f_sqrt
real(kind=8) function fprime_sqrt(x)
implicit none
real(kind=8), intent(in) :: x
fprime_sqrt = 2.d0 * x
end function fprime_sqrt
real(kind=8) function f_function(x)
implicit none
real(kind=8), intent(in) :: x
real(kind=8) :: pi = 3.141592653589793
f_function = (x*cos(pi*x))-(1.0d0-(0.6d0*(x**2)))
end function f_function
real(kind=8) function fprime_function(x)
implicit none
real(kind=8), intent(in) :: x
real(kind=8) :: pi = 3.141592653589793
fprime_function = ((-x*pi)*sin(pi*x))+cos(pi*x)+(1.2d0*x)
end function fprime_function
end module functions
# $MYHPSC/homework3/Makefile
# Example usage:
# $ make test1
# $ make clean
# $ make intersections
OBJECTS = functions.o newton.o test1.o
OBJECTS2 = functions.o newton.o intersections.o
MODULES = functions.mod newton.mod
FFLAGS = -g
.PHONY: test1 clean intersections
test1: test1.exe
./test1.exe
intersections: intersections.exe
./intersections.exe
test1.exe: $(MODULES) $(OBJECTS)
gfortran $(FFLAGS) $(OBJECTS) -o test1.exe
intersections.exe: $(MODULES) $(OBJECTS2)
gfortran $(FFLAGS) $(OBJECTS2) -o intersections.exe
%.o : %.f90
gfortran $(FFLAGS) -c $<
%.mod: %.f90
gfortran $(FFLAGS) -c $<
clean:
rm -f *.o *.exe *.mod
problem7/newton.f90, problem7/functions.f90, problem7/test_quartic.f90, problem7/Makefile, problem7/intersections.py (intersection problem using residual error for convergence)
! $MYHPSC/homework3/problem7/newton.f90
module newton
! module parameters:
implicit none
integer, parameter :: maxiter = 40
real(kind=8) :: tol
save
contains
subroutine solve(f, fp, x0, x, iters, debug)
! Estimate the zero of f(x) using Newton's method.
! Input:
! f: the function to find a root of
! fp: function returning the derivative f'
! x0: the initial guess
! debug: logical, prints iterations if debug=.true.
! Returns:
! the estimate x satisfying f(x)=0 (assumes Newton converged!)
! the number of iterations iters
implicit none
real(kind=8), intent(in) :: x0
real(kind=8), external :: f, fp
logical, intent(in) :: debug
real(kind=8), intent(out) :: x
integer, intent(out) :: iters
! Declare any local variables:
real(kind=8) :: deltax, fx, fxprime
integer :: k
! initial guess
x = x0
if (debug) then
print 11, x
11 format('Starting with initial guess: x = ', es22.15)
endif
! Newton iteration to find a zero of f(x)
do k=1,maxiter
! evaluate function and its derivative:
fx = f(x)
fxprime = fp(x)
if (abs(fx/fxprime) < tol) then
exit ! jump out of do loop
endif
! compute Newton increment x:
deltax = fx/fxprime
! update x:
x = x - deltax
if (debug) then
print 12, k,x
12 format('After', i3, ' iterations, x = ', es22.15)
endif
enddo
if (k > maxiter) then
! might not have converged
fx = f(x)
if (abs(fx) > tol) then
print *, '*** Warning: has not yet converged'
endif
endif
! number of iterations taken:
iters = k-1
end subroutine solve
end module newton
! $MYHPSC/homework3/problem7/functions.f90
module functions
! parameters for module
implicit none
real(kind=8) :: alpha
save
contains
real(kind=8) function f_sqrt(x)
implicit none
real(kind=8), intent(in) :: x
f_sqrt = x**2 - 4.d0
end function f_sqrt
real(kind=8) function fprime_sqrt(x)
implicit none
real(kind=8), intent(in) :: x
fprime_sqrt = 2.d0 * x
end function fprime_sqrt
real(kind=8) function f_function(x)
implicit none
real(kind=8), intent(in) :: x
real(kind=8) :: pi = 3.141592653589793
f_function = (x*cos(pi*x))-(1.0d0-(0.6d0*(x**2)))
end function f_function
real(kind=8) function fprime_function(x)
implicit none
real(kind=8), intent(in) :: x
real(kind=8) :: pi = 3.141592653589793
fprime_function = ((-x*pi)*sin(pi*x))+cos(pi*x)+(1.2d0*x)
end function fprime_function
real(kind=8) function f_quartic(x)
implicit none
real(kind=8), intent(in) :: x
f_quartic = (x-1)**4 - alpha
end function f_quartic
real(kind=8) function fprime_quartic(x)
implicit none
real(kind=8), intent(in) :: x
fprime_quartic = 4*(x-1)**3
end function fprime_quartic
end module functions
! $MYHPSC/homework3/problem7/test_quartic.f90
program test_quartic
use newton, only: solve, tol
use functions, only: f_quartic, fprime_quartic, alpha
implicit none
real(kind=8) :: x, x0, fx, dfx, xstar
real(kind=8) :: tolvals(3), alphavals(3)
integer :: iters, i, j
logical :: debug ! set to .true. or .false.
! values to test as x0:
x0 = 4.0d0
print 10, x0
10 format("Starting with initial guess x = ", es22.15)
debug = .false.
! values to use as tolerance:
tolvals = (/1.0d-5, 1.0d-10, 1.0d-14/)
! values to use as alpha:
alphavals = (/1.0d-4, 1.0d-8, 1.0d-12/)
print *, ' ' ! blank line
print *, ' alpha tol iters x f(x)/f`(x) x-xstar'
do i=1,3
do j =1,3
tol = tolvals(j)
alpha = alphavals(i)
xstar = 1+(alpha**0.25)
call solve(f_quartic, fprime_quartic, x0, x, iters, debug)
fx = f_quartic(x)
dfx = fprime_quartic(x)
print 11, alpha, tol, iters, x, fx/dfx, x-xstar
11 format(2es13.3, i4, es24.15, 2es13.3)
enddo
print *, ' ' ! blank line
enddo
end program test_quartic
# $MYHPSC/homework3/problem7/Makefile
# Example usage:
# $ make clean
# $ make test_quartic
OBJECTS = functions.o newton.o test_quartic.o
MODULES = functions.mod newton.mod
FFLAGS = -g
.PHONY: clean test_quartic
test_quartic: test_quartic.exe
./test_quartic.exe
test_quartic.exe: $(MODULES) $(OBJECTS)
gfortran $(FFLAGS) $(OBJECTS) -o test_quartic.exe
%.o : %.f90
gfortran $(FFLAGS) -c $<
%.mod: %.f90
gfortran $(FFLAGS) -c $<
clean:
rm -f *.o *.exe *.mod
"""
Script to plot g1(x) and g2(x) versus x
and to show where they intersect with black dots
Example use:
>>> run intersections
"""
import newton
import matplotlib.pyplot as plt
import numpy as np
guesses = np.array([-4,4])
xarray = np.zeros(2)
yarray = np.zeros(2)
i = 0
for xin in guesses:
(xout, iters) = newton.solve(xin)
xarray[i] = xout
yarray[i] = ((xout-1.0)**4.0)-1.0e-4
i = i+1
x = np.linspace(-5,5,1000)
g1 = ((x-1.0)**4.0)-1.0e-4
g2 = 0*x
plt.figure(1)
plt.clf()
plt.ylim(-5e-4,5e-4)
plt.xlim(0,2)
plt.xlabel('x (-)')
plt.ylabel('y (-)')
plt.title("Two functions and their intersecting points")
#p1 = plt.plot(x, g1, 'b', label=r'$y(x) = xcos(\pix)$')
p1 = plt.plot(x, g1, 'b', label=r'$y(x) = (x - 1)^4 - 1\times 10^{-4}$')
p2 = plt.plot(x, g2, 'r', label=r'$y(x) = 0$')
p3 = plt.plot(xarray, yarray, 'ko', label=r'$intersection$')
plt.legend(loc="best")
plt.show() #show the plot on the screen
plt.savefig('intersection.png') # save figure as .png file
2.4. Workspace 4: OpenMP¶
quadrature.f90 (module for integrating a function)
module quadrature
! Performs integration using quadrature integration
! and creates a table of the error between this and
! the known solution.
contains
real(kind=8) function trapezoid(f,a,b,n)
implicit none
real(kind=8), external :: f
real(kind=8), intent(in) :: a, b
integer, intent(in) :: n
real(kind=8) :: h
integer :: i
real(kind=8), dimension(n) :: xpoints, ypoints
h=(b-a)/(n-1)
! print *, " " ! blank line
! print 12, h
12 format("h: ", es22.14)
xpoints=(/((i*h+a),i=0,n-1)/)
ypoints=(/(f(xpoints(i)),i=1,n)/)
trapezoid = h*sum(ypoints)-0.5*h*(ypoints(1)+ypoints(n))
! do i=1,5
! print 13, xpoints(i), ypoints(i)
!13 format("x,y = " es22.14, es22.14)
! enddo
end function trapezoid
subroutine error_table(f,a,b,nvals,int_true)
implicit none
real(kind=8), external :: f
real(kind=8), intent(in) :: a, b, int_true
integer, dimension(:), intent(in) :: nvals
integer :: n
real(kind=8) :: last_error, int_trap, error, ratio
print *, " n trapezoid error ratio"
last_error = 0.d0
do n=1,size(nvals)
int_trap = trapezoid(f, a, b, nvals(n))
error = abs(int_trap - int_true)
ratio = last_error / error
last_error = error
print 11, nvals(n), int_trap, error, ratio
11 format(i8, es22.14, es13.3, es13.3)
enddo
end subroutine error_table
end module quadrature
test1.f90 (cubic function - uses quadrature module)
! $MYHPSC/homework4/test1.f90
! The purpose is to print a table for the effect of the number
! of integration points on the accuracy of the integration
! Example use:
! $ gfortran quadrature.f90 test1.f90
! $ ./a.out
program test1
! To provide input values to print error_table to the screen
use quadrature, only: trapezoid, error_table
implicit none
real(kind=8) :: a,b,int_true
integer :: nvals(7), i, n
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0
! print 10, int_true
! 10 format("true integral: ", es22.14)
! print *, " " ! blank line
! n=5
! print 11, trapezoid(f,a,b,n)
!11 format("calculated integral: ", es22.14)
! values of n to test:
do i=1,7
nvals(i) = 5 * 2**(i-1)
enddo
call error_table(f, a, b, nvals, int_true)
contains
! To return the the value of the function to integrate, f
real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x
f = 1.d0 + x**3
end function f
end program test1
test2.f90 (sinusoidal function - uses quadrature module)
! $MYHPSC/homework4/test2.f90
! The purpose is to print a table for the effect of the number
! of integration points on the accuracy of the integration
! Example use:
! $ gfortran quadrature.f90 test2.f90
! $ ./a.out
program test2
! To provide input values to print error_table to the screen
use quadrature, only: trapezoid, error_table
implicit none
real(kind=8) :: a,b,int_true,k
integer :: nvals(12), i, n
a = 0.d0
b = 2.d0
k = 1000.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - &
((cos(k*b) - cos(k*a)) / k)
! values of n to test:
do i=1,12
nvals(i) = 5 * 2**(i-1)
enddo
call error_table(f, a, b, nvals, int_true)
contains
! To return the the value of the function to integrate, f
real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x
f = 1.d0 + x**3 + sin(1000*x)
end function f
end program test2
quadrature_omp.f90, test2_omp.f90 (same as above but using OpenMP)
module quadrature_omp
! Performs integration using quadrature integration
! and creates a table of the error between this and
! the known solution.
contains
real(kind=8) function trapezoid(f,a,b,n)
use omp_lib
implicit none
real(kind=8), external :: f
real(kind=8), intent(in) :: a, b
integer, intent(in) :: n
real(kind=8) :: h
integer :: i
real(kind=8), dimension(n) :: xpoints, ypoints
h=(b-a)/(n-1)
!$omp parallel private(i)
!$omp do reduction(+ : trapezoid)
do i=1,n
xpoints(i)=(i-1)*h+a
ypoints(i)=f(xpoints(i))
enddo
trapezoid = h*sum(ypoints)-0.5*h*(ypoints(1)+ypoints(n))
!$omp end parallel
end function trapezoid
subroutine error_table(f,a,b,nvals,int_true)
implicit none
real(kind=8), external :: f
real(kind=8), intent(in) :: a, b, int_true
integer, dimension(:), intent(in) :: nvals
integer :: n
real(kind=8) :: last_error, int_trap, error, ratio
print *, " n trapezoid error ratio"
last_error = 0.d0
do n=1,size(nvals)
int_trap = trapezoid(f, a, b, nvals(n))
error = abs(int_trap - int_true)
ratio = last_error / error
last_error = error
print 11, nvals(n), int_trap, error, ratio
11 format(i8, es22.14, es13.3, es13.3)
enddo
end subroutine error_table
end module quadrature_omp
! $MYHPSC/homework4/test2_omp.f90
! The purpose is to print a table for the effect of the number
! of integration points on the accuracy of the integration
! Example use:
! $ gfortran -fopenmp quadrature_omp.f90 test2_omp.f90
! $ ./a.out
program test2_omp
! To provide input values to print error_table to the screen
use omp_lib
use quadrature_omp, only: trapezoid, error_table
implicit none
real(kind=8) :: a,b,int_true,k
integer :: nvals(20), i, n, nthreads
real(kind=8) :: t1, t2, elapsed_time
integer(kind=8) :: tclock1, tclock2, clock_rate
a = 0.d0
b = 2.d0
k = 1000.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - &
((cos(k*b) - cos(k*a)) / k)
! Specify number of threads to use:
nthreads = 1 ! need this value in serial mode
!$ nthreads = 2
!$ call omp_set_num_threads(nthreads)
!$ print "('Using OpenMP with ',i3,' threads')", nthreads
! Specify number of threads to use:
!$ call omp_set_num_threads(2)
! values of n to test:
do i=1,20
nvals(i) = 5 * 2**(i-1)
enddo
call system_clock(tclock1) ! start wall timer
call cpu_time(t1) ! start cpu timer
call error_table(f, a, b, nvals, int_true)
call cpu_time(t2) ! end cpu timer
print 10, t2-t1
10 format("CPU time = ",es13.3, " seconds")
call system_clock(tclock2, clock_rate)
elapsed_time = float(tclock2 - tclock1) / float(clock_rate)
print 11, elapsed_time
11 format("Elapsed time = ",es13.3, " seconds")
contains
! To return the the value of the function to integrate, f
real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x
f = 1.d0 + x**3 + sin(1000*x)
end function f
end program test2_omp
2.5. Workspace 5: Load Balancing¶
Makefile (makefile for the 1D solution)
OBJECTS = functions.o quadrature.o test.o
OBJECTS2 = functions.o quadrature2.o test2.o
OBJECTS3 = functions.o quadrature3.o test3.o
FFLAGS = -fopenmp
.PHONY: test test2 test3 clean
test: test.exe
./test.exe
test2: test2.exe
./test2.exe
test3: test3.exe
./test3.exe
test.exe: $(OBJECTS)
gfortran $(FFLAGS) $(OBJECTS) -o test.exe
test2.exe: $(OBJECTS2)
gfortran $(FFLAGS) $(OBJECTS2) -o test2.exe
test3.exe: $(OBJECTS3)
gfortran $(FFLAGS) $(OBJECTS3) -o test3.exe
%.o : %.f90
gfortran $(FFLAGS) -c $<
clean:
rm -f *.o *.exe *.mod
functions.f90 (sinusoial function module)
functions.f90
module functions
use omp_lib
implicit none
integer :: fevals(0:7)
real(kind=8) :: k
save
contains
real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x
integer thread_num
! keep track of number of function evaluations by
! each thread:
thread_num = 0 ! serial mode
!$ thread_num = omp_get_thread_num()
fevals(thread_num) = fevals(thread_num) + 1
f = 1.d0 + x**3 + sin(k*x)
end function f
end module functions
quadrature.f90, test.f90 (trapezoid rule module + test - serial)
module quadrature
use omp_lib
contains
real(kind=8) function trapezoid(f, a, b, n)
! Estimate the integral of f(x) from a to b using the
! Trapezoid Rule with n points.
! Input:
! f: the function to integrate
! a: left endpoint
! b: right endpoint
! n: number of points to use
! Returns:
! the estimate of the integral
implicit none
real(kind=8), intent(in) :: a,b
real(kind=8), external :: f
integer, intent(in) :: n
! Local variables:
integer :: j
real(kind=8) :: h, trap_sum, xj
h = (b-a)/(n-1)
trap_sum = 0.5d0*(f(a) + f(b)) ! endpoint contributions
!$omp parallel do private(xj) reduction(+ : trap_sum)
do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(xj)
enddo
trapezoid = h * trap_sum
end function trapezoid
subroutine error_table(f,a,b,nvals,int_true,method)
! Compute and print out a table of errors when the quadrature
! rule specified by the input function method is applied for
! each value of n in the array nvals.
implicit none
real(kind=8), intent(in) :: a,b, int_true
real(kind=8), external :: f, method
integer, dimension(:), intent(in) :: nvals
! Local variables:
integer :: j, n
real(kind=8) :: ratio, last_error, error, int_approx
print *, " n approximation error ratio"
last_error = 0.d0
do j=1,size(nvals)
n = nvals(j)
int_approx = method(f,a,b,n)
error = abs(int_approx - int_true)
ratio = last_error / error
last_error = error ! for next n
print 11, n, int_approx, error, ratio
11 format(i8, es22.14, es13.3, es13.3)
enddo
end subroutine error_table
end module quadrature
program test
use omp_lib
use quadrature, only: trapezoid, error_table
use functions, only: f, fevals, k
implicit none
real(kind=8) :: a,b,int_true
integer :: nvals(12)
integer :: i, nthreads
real(kind=8) :: t1, t2, elapsed_time
integer(kind=8) :: tclock1, tclock2, clock_rate
nthreads = 1 ! for serial mode
!$ nthreads = 4 ! for openmp
!$ call omp_set_num_threads(nthreads)
print 100, nthreads
100 format("Using ",i2," threads")
fevals = 0
k = 1.d3 ! functions module variable for function f2
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
print 10, int_true
10 format("true integral: ", es22.14)
print *, " " ! blank line
! values of n to test: (larger values than before)
do i=1,12
nvals(i) = 50 * 2**(i-1)
enddo
! time the call to error_table:
call system_clock(tclock1)
call cpu_time(t1)
call error_table(f, a, b, nvals, int_true, trapezoid)
call cpu_time(t2)
call system_clock(tclock2, clock_rate)
elapsed_time = float(tclock2 - tclock1) / float(clock_rate)
print *, " "
print 11, elapsed_time
11 format("Elapsed time = ",f12.8, " seconds")
print 12, t2-t1
12 format("CPU time = ",f12.8, " seconds")
! print the number of function evaluations by each thread:
do i=0,nthreads-1
print 101, i, fevals(i)
101 format("fevals by thread ",i2,": ",i13)
enddo
print 102, sum(fevals)
102 format("Total number of fevals: ",i10)
end program test
quadrature2.f90, test2.f90 (simpson’s rule module + test - serial)
module quadrature2
use omp_lib
contains
real(kind=8) function simpson(f, a, b, n)
! Estimate the integral of f(x) from a to b using the
! Simpson Rule with n points.
! Input:
! f: the function to integrate
! a: left endpoint
! b: right endpoint
! n: number of points to use
! Returns:
! the estimate of the integral
implicit none
real(kind=8), intent(in) :: a,b
real(kind=8), external :: f
integer, intent(in) :: n
! Local variables:
integer :: j
real(kind=8) :: h, simp_sum, xj, xj2
h = (b-a)/(n-1)
simp_sum = (1.d0/6.d0)*(f(a) + f(b)) ! endpoint contributions
simp_sum = simp_sum + (2.d0/3.d0)*f(a+0.5d0*h) !for the half before the loop
!$omp parallel do private(xj,xj2) reduction(+ : simp_sum)
do j=2,n-1
xj = a + (j-1.d0)*h
xj2 = a + (j-0.5d0)*h
simp_sum = simp_sum + (1.d0/3.d0)*f(xj) + (2.d0/3.d0)*f(xj2)
enddo
simpson = h * simp_sum
end function simpson
subroutine error_table(f,a,b,nvals,int_true,method)
! Compute and print out a table of errors when the quadrature
! rule specified by the input function method is applied for
! each value of n in the array nvals.
implicit none
real(kind=8), intent(in) :: a,b, int_true
real(kind=8), external :: f, method
integer, dimension(:), intent(in) :: nvals
! Local variables:
integer :: j, n
real(kind=8) :: ratio, last_error, error, int_approx
print *, " n approximation error ratio"
last_error = 0.d0
do j=1,size(nvals)
n = nvals(j)
int_approx = method(f,a,b,n)
error = abs(int_approx - int_true)
ratio = last_error / error
last_error = error ! for next n
print 11, n, int_approx, error, ratio
11 format(i8, es22.14, es13.3, es13.3)
enddo
end subroutine error_table
end module quadrature2
program test2
use omp_lib
use quadrature2, only: simpson, error_table
use functions, only: f, fevals, k
implicit none
real(kind=8) :: a,b,int_true
integer :: nvals(12)
integer :: i, nthreads
real(kind=8) :: t1, t2, elapsed_time
integer(kind=8) :: tclock1, tclock2, clock_rate
nthreads = 1 ! for serial mode
!$ nthreads = 4 ! for openmp
!$ call omp_set_num_threads(nthreads)
print 100, nthreads
100 format("Using ",i2," threads")
fevals = 0
k = 1.d3 ! functions module variable for function f2
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
print 10, int_true
10 format("true integral: ", es22.14)
print *, " " ! blank line
! values of n to test: (larger values than before)
do i=1,12
nvals(i) = 50 * 2**(i-1)
enddo
! time the call to error_table:
call system_clock(tclock1)
call cpu_time(t1)
call error_table(f, a, b, nvals, int_true, simpson)
call cpu_time(t2)
call system_clock(tclock2, clock_rate)
elapsed_time = float(tclock2 - tclock1) / float(clock_rate)
print *, " "
print 11, elapsed_time
11 format("Elapsed time = ",f12.8, " seconds")
print 12, t2-t1
12 format("CPU time = ",f12.8, " seconds")
! print the number of function evaluations by each thread:
do i=0,nthreads-1
print 101, i, fevals(i)
101 format("fevals by thread ",i2,": ",i13)
enddo
print 102, sum(fevals)
102 format("Total number of fevals: ",i10)
end program test2
quadrature3.f90, test3.f90 (trapezoid rule module + test - OpenMP)
module quadrature3
use omp_lib
contains
real(kind=8) function trapezoid(f, a, b, n)
! Estimate the integral of f(x) from a to b using the
! Trapezoid Rule with n points.
! Input:
! f: the function to integrate
! a: left endpoint
! b: right endpoint
! n: number of points to use
! Returns:
! the estimate of the integral
implicit none
real(kind=8), intent(in) :: a,b
real(kind=8), external :: f
integer, intent(in) :: n
! Local variables:
integer :: j
real(kind=8) :: h, trap_sum, xj
h = (b-a)/(n-1)
trap_sum = 0.5d0*(f(a) + f(b)) ! endpoint contributions
do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(xj)
enddo
trapezoid = h * trap_sum
end function trapezoid
subroutine error_table(f,a,b,nvals,int_true,method)
! Compute and print out a table of errors when the quadrature
! rule specified by the input function method is applied for
! each value of n in the array nvals.
implicit none
real(kind=8), intent(in) :: a,b, int_true
real(kind=8), external :: f, method
integer, dimension(:), intent(in) :: nvals
! Local variables:
integer :: j, n
real(kind=8) :: ratio, last_error, error, int_approx
integer thread_num
print *, " n approximation error ratio thread no."
last_error = 0.d0
!$omp parallel do firstprivate(last_error) private(n, int_approx, error, ratio) &
!$omp schedule(dynamic)
do j=size(nvals),1,-1
n = nvals(j)
int_approx = method(f,a,b,n)
error = abs(int_approx - int_true)
ratio = last_error / error
last_error = error ! for next n -- last_error is firstprivate variable -
! each thead must start with last_error as zero - so ratio will be wrong.
thread_num = 0 ! serial mode
!$ thread_num = omp_get_thread_num()
print 11, n, int_approx, error, ratio, thread_num
11 format(i8, es22.14, es13.3, es13.3, i8)
enddo
end subroutine error_table
end module quadrature3
program test3
use omp_lib
use quadrature3, only: trapezoid, error_table
use functions, only: f, fevals, k
implicit none
real(kind=8) :: a,b,int_true
integer :: nvals(12)
integer :: i, nthreads
real(kind=8) :: t1, t2, elapsed_time
integer(kind=8) :: tclock1, tclock2, clock_rate
nthreads = 1 ! for serial mode
!$ nthreads = 4 ! for openmp
!$ call omp_set_num_threads(nthreads)
print 100, nthreads
100 format("Using ",i2," threads")
fevals = 0
k = 1.d3 ! functions module variable for function f2
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
print 10, int_true
10 format("true integral: ", es22.14)
print *, " " ! blank line
! values of n to test: (larger values than before)
do i=1,12
nvals(i) = 50 * 2**(i-1)
enddo
! time the call to error_table:
call system_clock(tclock1)
call cpu_time(t1)
call error_table(f, a, b, nvals, int_true, trapezoid)
call cpu_time(t2)
call system_clock(tclock2, clock_rate)
elapsed_time = float(tclock2 - tclock1) / float(clock_rate)
print *, " "
print 11, elapsed_time
11 format("Elapsed time = ",f12.8, " seconds")
print 12, t2-t1
12 format("CPU time = ",f12.8, " seconds")
! print the number of function evaluations by each thread:
do i=0,nthreads-1
print 101, i, fevals(i)
101 format("fevals by thread ",i2,": ",i13)
enddo
print 102, sum(fevals)
102 format("Total number of fevals: ",i10)
end program test3
quad2d/functions.f90, quad2d/quadrature.f90, quad2d/test.f90, quad2d/Makefile (integrate a sinusoid with trapezoid rule in 2D with Open MP and load balancing)
module functions
use omp_lib
implicit none
integer :: fevals(0:7), gevals(0:7)
real(kind=8) :: k
save
contains
real(kind=8) function f(g, x, c, d)
implicit none
real(kind=8), intent(in) :: c,d
real(kind=8), intent(in) :: x
real(kind=8), external :: g
integer thread_num, n, j
real(kind=8) :: h, trap_sum, y
! keep track of number of function evaluations by
! each thread:
thread_num = 0 ! serial mode
!$ thread_num = omp_get_thread_num()
fevals(thread_num) = fevals(thread_num) + 1
n = 1000
h = (d-c)/(n-1)
trap_sum = 0.5d0*(g(x,c) + g(x,d)) ! endpoint contributions
! !$omp parallel do private(y) reduction(+ : trap_sum)
do j=2,n-1
y = c + (j-1)*h
trap_sum = trap_sum + g(x,y)
enddo
f = h * trap_sum
end function f
real(kind=8) function g(x,y)
implicit none
real(kind=8), intent(in) :: x,y
integer thread_num_g
thread_num_g = 0
!$ thread_num_g = omp_get_thread_num()
gevals(thread_num_g) = gevals(thread_num_g) + 1
g = sin(x+y)
end function g
end module functions
module quadrature
use omp_lib
contains
real(kind=8) function trapezoid(f, g, a, b, c, d, n)
! Estimate the integral of f(x) from a to b using the
! Trapezoid Rule with n points.
! Input:
! f: the function to integrate
! a: left endpoint x
! b: right endpoint x
! c: left endpoint y
! d: right endpoint y
! n: number of points to use
! Returns:
! the estimate of the integral
implicit none
real(kind=8), intent(in) :: a,b,c,d
real(kind=8), external :: f,g
integer, intent(in) :: n
! Local variables:
integer :: j
real(kind=8) :: h, trap_sum, xj
h = (b-a)/(n-1)
trap_sum = 0.5d0*(f(g, a, c, d) + f(g, b, c, d)) ! endpoint contributions
!$omp parallel do private(xj) reduction(+ : trap_sum)
do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(g, xj, c, d)
enddo
trapezoid = h * trap_sum
end function trapezoid
subroutine error_table(f,g,a,b,c,d,nvals,int_true,method)
! Compute and print out a table of errors when the quadrature
! rule specified by the input function method is applied for
! each value of n in the array nvals.
implicit none
real(kind=8), intent(in) :: a,b,c,d, int_true
real(kind=8), external :: f, g, method
integer, dimension(:), intent(in) :: nvals
! Local variables:
integer :: j, n
real(kind=8) :: ratio, last_error, error, int_approx
print *, " n approximation error ratio"
last_error = 0.d0
do j=1,size(nvals)
n = nvals(j)
int_approx = method(f,g,a,b,c,d,n)
error = abs(int_approx - int_true)
ratio = last_error / error
last_error = error ! for next n
print 11, n, int_approx, error, ratio
11 format(i8, es22.14, es13.3, es13.3)
enddo
end subroutine error_table
end module quadrature
program test
use omp_lib
use quadrature, only: trapezoid, error_table
use functions, only: f, g, fevals, gevals, k
implicit none
real(kind=8) :: a,b,c,d,int_true
integer :: nvals(10)
integer :: i, nthreads
real(kind=8) :: t1, t2, elapsed_time
integer(kind=8) :: tclock1, tclock2, clock_rate
nthreads = 1 ! for serial mode
!$ nthreads = 4 ! for openmp
!$ call omp_set_num_threads(nthreads)
print 100, nthreads
100 format("Using ",i2," threads")
fevals = 0
k = 1.d3 ! functions module variable for function f2
a = 0.d0
b = 2.d0
c = 1.d0
d = 4.d0
! int_true = (-sin(b+c)+sin(b+d))-(-sin(a+c)+sin(a+d))
int_true = (-sin(b+d)+sin(b+c))-(-sin(a+d)+sin(a+c))
print 10, int_true
10 format("true integral: ", es22.14)
print *, " " ! blank line
! values of n to test: (larger values than before)
do i=1,10
nvals(i) = 5 * 2**(i-1)
enddo
! time the call to error_table:
call system_clock(tclock1)
call cpu_time(t1)
call error_table(f, g, a, b, c, d, nvals, int_true, trapezoid)
call cpu_time(t2)
call system_clock(tclock2, clock_rate)
elapsed_time = float(tclock2 - tclock1) / float(clock_rate)
print *, " "
print 11, elapsed_time
11 format("Elapsed time = ",f12.8, " seconds")
print 12, t2-t1
12 format("CPU time = ",f12.8, " seconds")
! print the number of function evaluations by each thread:
do i=0,nthreads-1
print 101, i, fevals(i)
101 format("fevals by thread ",i2,": ",i13)
enddo
print 102, sum(fevals)
102 format("Total number of fevals: ",i10)
! print the number of function evaluations by each thread:
do i=0,nthreads-1
print 103, i, gevals(i)
103 format("gevals by thread ",i2,": ",i13)
enddo
print 104, sum(gevals)
104 format("Total number of gevals: ",i10)
end program test
2.6. Workspace 6: MPI¶
Makefile (makefile for this directory)
# $MYHPSC/codes/mpi/Makefile
# Usage: to try out test1.f90, for example:
# make test FNAME=test1
# or
# make test FNAME=test1 NUM_PROCS=8
NUM_PROCS ?= 4 # default if not specified on command line or env variable
FC = mpif90
FFLAGS =
LFLAGS =
.PHONY: clean, test
FNAME ?= test
%.o : %.f90
$(FC) $(FFLAGS) -c $<
$(FNAME).exe: $(FNAME).f90
$(FC) $(FFLAGS) $(LFLAGS) $(FNAME).f90 -o $(FNAME).exe
test: $(FNAME).exe
mpiexec -n $(NUM_PROCS) ./$(FNAME).exe
clean:
rm -f *.o *.exe
copyvalue.f90 (passes a value from Process 0 to Processes 1-3 using MPI)
! $UWHPSC/codes/mpi/copyvalue.f90
!
! Set value in Process 0 and copy this through a chain of processes
! and finally print result from Process numprocs-1.
!
program copyvalue
use mpi
implicit none
integer :: i, proc_num, num_procs,ierr
integer, dimension(MPI_STATUS_SIZE) :: status
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)
if (num_procs==1) then
print *, "Only one process, cannot do anything"
call MPI_FINALIZE(ierr)
stop
endif
if (proc_num==0) then
i = 55
print '("Process ",i3," setting i = ",i3)', proc_num, i
call MPI_SEND(i, 1, MPI_INTEGER, 1, 21, &
MPI_COMM_WORLD, ierr)
else if (proc_num < num_procs - 1) then
call MPI_RECV(i, 1, MPI_INTEGER, proc_num-1, 21, &
MPI_COMM_WORLD, status, ierr)
print '("Process ",i3," receives i = ",i3)', proc_num, i
print '("Process ",i3," sends i = ",i3)', proc_num, i
call MPI_SEND(i, 1, MPI_INTEGER, proc_num+1, 21, &
MPI_COMM_WORLD, ierr)
else if (proc_num == num_procs - 1) then
call MPI_RECV(i, 1, MPI_INTEGER, proc_num-1, 21, &
MPI_COMM_WORLD, status, ierr)
print '("Process ",i3," ends up with i = ",i3)', proc_num, i
endif
call MPI_FINALIZE(ierr)
end program copyvalue
copyvalue2.f90 (passes a value from Process 3 to Processes 0-2 using MPI)
! $UWHPSC/codes/mpi/copyvalue2.f90
!
! Set value in Process numprocs-1 and copy this through a chain of processes
! and finally print result from Process 0.
!
program copyvalue2
use mpi
implicit none
integer :: i, proc_num, num_procs,ierr
integer, dimension(MPI_STATUS_SIZE) :: status
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)
if (num_procs .EQ. 1) then
print *, "Only one process, cannot do anything"
call MPI_FINALIZE(ierr)
stop
endif
if (proc_num .EQ. num_procs-1) then
i = 55
print '("Process ",i3," setting i = ",i3)', proc_num, i
call MPI_SEND(i, 1, MPI_INTEGER, num_procs-2, 21, &
MPI_COMM_WORLD, ierr)
else if (proc_num .GT. 0) then
call MPI_RECV(i, 1, MPI_INTEGER, proc_num+1, 21, &
MPI_COMM_WORLD, status, ierr)
print '("Process ",i3," receives i = ",i3)', proc_num, i
print '("Process ",i3," sends i = ",i3)', proc_num, i-1
call MPI_SEND(i-1, 1, MPI_INTEGER, proc_num-1, 21, &
MPI_COMM_WORLD, ierr)
else if (proc_num .EQ. 0) then
call MPI_RECV(i, 1, MPI_INTEGER, 1, 21, &
MPI_COMM_WORLD, status, ierr)
print '("Process ",i3," ends up with i = ",i3)', proc_num, i
endif
call MPI_FINALIZE(ierr)
end program copyvalue2
part2/functions.f90, part2/quadrature.f90, part2/Makefile (the sinusoidal function, quadrature module and makefile)
module functions
implicit none
integer :: fevals_proc
real(kind=8) :: k
save
contains
real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x
integer proc_num, ierr
! keep track of number of function evaluations by
! each process:
fevals_proc = fevals_proc + 1
f = 1.d0 + x**3 + sin(k*x)
end function f
end module functions
module quadrature
contains
real(kind=8) function trapezoid(f, a, b, n)
! Estimate the integral of f(x) from a to b using the
! Trapezoid Rule with n points.
! Input:
! f: the function to integrate
! a: left endpoint
! b: right endpoint
! n: number of points to use
! Returns:
! the estimate of the integral
implicit none
real(kind=8), intent(in) :: a,b
real(kind=8), external :: f
integer, intent(in) :: n
! Local variables:
integer :: j
real(kind=8) :: h, trap_sum, xj
h = (b-a)/(n-1)
trap_sum = 0.5d0*(f(a) + f(b)) ! endpoint contributions
do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(xj)
enddo
trapezoid = h * trap_sum
end function trapezoid
end module quadrature
OBJECTS = functions.o quadrature.o test.o
OBJECTS2 = functions.o quadrature.o test2.o
FFLAGS =
NUM_PROCS ?= 4 # default if not specified on command line or env variable
.PHONY: test clean test2
test: test.exe
mpiexec -n $(NUM_PROCS) test.exe
test2: test2.exe
mpiexec -n $(NUM_PROCS) test2.exe
test.exe: $(OBJECTS)
mpif90 $(FFLAGS) $(OBJECTS) -o test.exe
test2.exe: $(OBJECTS2)
mpif90 $(FFLAGS) $(OBJECTS2) -o test2.exe
%.o : %.f90
mpif90 $(FFLAGS) -c $<
clean:
rm -f *.o *.exe *.mod
part2/test.f90 (compute the same integral over all Processes using MPI)
program test
use mpi
use quadrature, only: trapezoid
use functions, only: f, fevals_proc, k
implicit none
real(kind=8) :: a,b,int_true, int_approx
integer :: proc_num, num_procs, ierr, n, fevals_total
integer, dimension(MPI_STATUS_SIZE) :: status
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)
! All processes set these values so we don't have to broadcast:
k = 1.d3 ! functions module variable
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
n = 1000
! Each process keeps track of number of fevals:
fevals_proc = 0
if (proc_num==0) then
print '("Using ",i3," processes")', num_procs
print '("true integral: ", es22.14)', int_true
print *, " " ! blank line
endif
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for process 0 to print
! Note: In this version all processes call trap and repeat the
! same work, so each should get the same answer.
int_approx = trapezoid(f,a,b,n)
print '("Process ",i3," with n = ",i8," computes int_approx = ",es22.14)', &
proc_num,n, int_approx
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for all process to print
! print the number of function evaluations by each thread:
print '("fevals by Process ",i2,": ",i13)', proc_num, fevals_proc
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for all process to print
call MPI_REDUCE(fevals_proc, fevals_total, 1, &
MPI_DOUBLE_PRECISION, MPI_SUM, 0, &
MPI_COMM_WORLD, ierr)
if (proc_num==0) then
! This is wrong -- part of homework is to fix this:
! fevals_total = 0 !! need to fix
print '("Total number of fevals: ",i10)', fevals_total
endif
call MPI_FINALIZE(ierr)
end program test
part2/test2.f90 (integral using Master-Worker paradigm)Makefile for this directory
program test2
use mpi
use quadrature, only: trapezoid
use functions, only: f, fevals_proc, k
implicit none
real(kind=8) :: a, b, int_true, int_approx, dx_sub, int_sub
integer :: proc_num, num_procs, ierr, n, fevals_total, n_sub, j, jj
integer, dimension(MPI_STATUS_SIZE) :: status
real(kind=8), allocatable, dimension(:,:) :: ab_sub
real(kind=8), allocatable, dimension(:) :: sub, int_array
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)
! All processes set these values so we don't have to broadcast:
k = 1.d3 ! functions module variable
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
n = 1000
n_sub = num_procs-1 ! Set the number of Worker Processes
! Each process keeps track of number of fevals:
fevals_proc = 0
if (proc_num .GT. 0) then
allocate(sub(2)) ! to hold a column vector sent from master
endif
if (proc_num .EQ. 0) then
print '("Using ",i3," processes")', num_procs
print '("true integral: ", es22.14)', int_true
print *, " " ! blank line
allocate(ab_sub(2,n_sub)) ! to hold a-b values for Workers
allocate(int_array(n_sub)) ! to hold integral from Workers
ab_sub = 1.d0
! Process 0 (Master) sends other processes (Workers) left and right edges
! of the integral
dx_sub = (b-a) / n_sub
do j=1,n_sub
ab_sub(1,j) = a + (j-1)*dx_sub
ab_sub(2,j) = a + j*dx_sub
call MPI_SEND(ab_sub(1,j), 2, MPI_DOUBLE_PRECISION, j, j, &
MPI_COMM_WORLD, ierr)
enddo
! Master recieves part of the integral from the Workers and stores it in
! an array from any process
do j=1,n_sub
call MPI_RECV(int_sub, 1, MPI_DOUBLE_PRECISION, &
MPI_ANY_SOURCE, MPI_ANY_TAG, &
MPI_COMM_WORLD, status, ierr)
jj = status(MPI_TAG)
int_array(jj) = int_sub
enddo
int_approx = sum(abs(int_array))
endif
! Process 0 (Master) sends other processes (Workers) left and right edges
! of the integral
if(proc_num .GT. 0) then
call MPI_RECV(sub, 2, MPI_DOUBLE_PRECISION, &
0, MPI_ANY_TAG, &
MPI_COMM_WORLD, status, ierr)
int_sub = trapezoid(f,sub(1),sub(2),n)
j = status(MPI_TAG) ! this is the row number
! (should agree with proc_num)
call MPI_SEND(int_sub, 1, MPI_DOUBLE_PRECISION, &
0, j, MPI_COMM_WORLD, ierr)
! print the number of function evaluations by each thread:
print '("fevals by Process ",i2,": ",i13)', proc_num, fevals_proc
endif
! Master Process prints out the result:
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for all processes to print
if (proc_num .EQ. 0) then
print '("Trapezoid approximation with ",i8," total points: ",es22.14)',&
n_sub*n, int_approx
endif
call MPI_FINALIZE(ierr)
end program test2
2.7. Project 1: 1D Integral and MPI¶
functions.f90, quadrature.f90, test2.f90, Makefile (compute a 1D integral using MPI by dividing up intervals between an indeterminate number of processes)
OBJECTS = functions.o quadrature.o test.o
OBJECTS2 = functions.o quadrature.o test2.o
OBJECTS3 = functions.o quadrature.o test3.o
FFLAGS =
NUM_PROCS ?= 4 # default if not specified on command line or env variable
.PHONY: test clean test2 test3
test: test.exe
mpiexec -n $(NUM_PROCS) test.exe
test2: test2.exe
mpiexec -n $(NUM_PROCS) test2.exe
test3: test3.exe
mpiexec -n $(NUM_PROCS) test3.exe
test.exe: $(OBJECTS)
mpif90 $(FFLAGS) $(OBJECTS) -o test.exe
test2.exe: $(OBJECTS2)
mpif90 $(FFLAGS) $(OBJECTS2) -o test2.exe
test3.exe: $(OBJECTS3)
mpif90 $(FFLAGS) $(OBJECTS3) -o test3.exe
%.o : %.f90
mpif90 $(FFLAGS) -c $<
clean:
rm -f *.o *.exe *.mod
module functions
implicit none
integer :: fevals_proc
real(kind=8) :: k
save
contains
real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x
integer proc_num, ierr
! keep track of number of function evaluations by
! each process:
fevals_proc = fevals_proc + 1
f = 1.d0 + x**3 + sin(k*x)
end function f
end module functions
module quadrature
contains
real(kind=8) function trapezoid(f, a, b, n)
! Estimate the integral of f(x) from a to b using the
! Trapezoid Rule with n points.
! Input:
! f: the function to integrate
! a: left endpoint
! b: right endpoint
! n: number of points to use
! Returns:
! the estimate of the integral
implicit none
real(kind=8), intent(in) :: a,b
real(kind=8), external :: f
integer, intent(in) :: n
! Local variables:
integer :: j
real(kind=8) :: h, trap_sum, xj
h = (b-a)/(n-1)
trap_sum = 0.5d0*(f(a) + f(b)) ! endpoint contributions
do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(xj)
enddo
trapezoid = h * trap_sum
end function trapezoid
end module quadrature
program test2
use mpi
use quadrature, only: trapezoid
use functions, only: f, fevals_proc, k
implicit none
real(kind=8) :: a, b, int_true, int_approx, dx_sub, int_sub
integer :: proc_num, num_procs, ierr, n, fevals_total, n_sub, j, jj
integer, dimension(MPI_STATUS_SIZE) :: status
real(kind=8), allocatable, dimension(:,:) :: ab_sub
real(kind=8), allocatable, dimension(:) :: sub, int_array
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)
! All processes set these values so we don't have to broadcast:
k = 1.d3 ! functions module variable
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
n = 1000
n_sub = num_procs-1 ! Set the number of Worker Processes
! Each process keeps track of number of fevals:
fevals_proc = 0
if (proc_num .GT. 0) then
allocate(sub(2)) ! to hold a column vector sent from master
endif
if (proc_num .EQ. 0) then
print '("Using ",i3," processes")', num_procs
print '("true integral: ", es22.14)', int_true
print *, " " ! blank line
allocate(ab_sub(2,n_sub)) ! to hold a-b values for Workers
allocate(int_array(n_sub)) ! to hold integral from Workers
ab_sub = 1.d0
! Process 0 (Master) sends other processes (Workers) left and right edges
! of the integral
dx_sub = (b-a) / n_sub
do j=1,n_sub
ab_sub(1,j) = a + (j-1)*dx_sub
ab_sub(2,j) = a + j*dx_sub
call MPI_SEND(ab_sub(1,j), 2, MPI_DOUBLE_PRECISION, j, j, &
MPI_COMM_WORLD, ierr)
enddo
! Master recieves part of the integral from the Workers and stores it in
! an array from any process
do j=1,n_sub
call MPI_RECV(int_sub, 1, MPI_DOUBLE_PRECISION, &
MPI_ANY_SOURCE, MPI_ANY_TAG, &
MPI_COMM_WORLD, status, ierr)
jj = status(MPI_TAG)
int_array(jj) = int_sub
enddo
int_approx = sum(abs(int_array))
endif
! Process 0 (Master) sends other processes (Workers) left and right edges
! of the integral
if(proc_num .GT. 0) then
call MPI_RECV(sub, 2, MPI_DOUBLE_PRECISION, &
0, MPI_ANY_TAG, &
MPI_COMM_WORLD, status, ierr)
int_sub = trapezoid(f,sub(1),sub(2),n)
j = status(MPI_TAG) ! this is the row number
! (should agree with proc_num)
call MPI_SEND(int_sub, 1, MPI_DOUBLE_PRECISION, &
0, j, MPI_COMM_WORLD, ierr)
! print the number of function evaluations by each thread:
print '("fevals by Process ",i2,": ",i13)', proc_num, fevals_proc
endif
! Master Process prints out the result:
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for all processes to print
if (proc_num .EQ. 0) then
print '("Trapezoid approximation with ",i8," total points: ",es22.14)',&
n_sub*n, int_approx
endif
call MPI_FINALIZE(ierr)
end program test2
2.8. Project 2: 20D Integral and the Monte Carlo Method¶
functions.f90, quadrature_mc.f90, random_util.f90, test_quad_mc.f90, Makefile (compute integral using Monte Carlo method for a 20 dimensional integral - serial)
plot_mc_quad_error.py (plot the result)
OBJECTS = random_util.o functions.o quadrature_mc.o test_quad_mc.o
FFLAGS =
.PHONY: test plot clean clobber
test: test.exe
./test.exe
test.exe: $(OBJECTS)
gfortran $(FFLAGS) $(OBJECTS) -o test.exe
%.o : %.f90
gfortran $(FFLAGS) -c $<
mc_quad_error.txt: test.exe
./test.exe
plot: mc_quad_error.txt
python plot_mc_quad_error.py
clean:
rm -f *.o *.exe *.mod
clobber: clean
rm -f mc_quad_error.txt mc_quad_error.png
! ---
! This module returns the value of a function:
! sum(x**2) over all dimensions
! it also sets the number of evaulations of this function
! ---
module functions
! ---
! gevals is a module variable
! Example use: use functions, only gevals
! ---
implicit none
integer :: gevals
save
contains
! ---
! Multiple dimension function
! Example use: f = g(xin,ndim)
! Inputs: x (vector of length ndim), ndim (number of dimensions)
! Output: g (the value of the function over all dimensions)
! ---
function g(x,ndim)
! ---
! Declare parameters and vectors
! ---
implicit none
integer, intent(in) :: ndim
real(kind=8), intent(in) :: x(ndim)
real(kind=8) :: g
integer :: i
! ---
! We need to initialise the number of evaluations to zero
! ---
g = 0.d0
! ---
! Evaluate function in 20 dimensions
! ---
do i=1,ndim
g = g + x(i)**2
enddo
! ---
! Update the number of function evaluations
! ---
gevals = gevals + 1
end function g
end module functions
! ---
! This module performs Monte Carlo Integration by:
! 1) Generating random input over a number of points
! 2) Evaluating the function at these points for 20 dimensions
! 3) Summing the function and dividing by the number of points for the average
! 4) Multiplying the average by the base for the integral
! ---
module quadrature_mc
! ---
! Uses random seed and functions modules
! ---
use random_util, only: init_random_seed
use functions, only: g
contains
! ---
! Monte Carlo Integration function
! Example use: int_mc = quad_mc(g,a,b,ndim,npoints)
! Inputs: g (from functions module), a and b (vectors of length ndim)
! ndim (number of dimensions) npoints (number of points)
! Output: quad_mc (the value of the integral)
! ---
function quad_mc(g, a, b, ndim, npoints)
! ---
! Declare parameters and arrays.
! ---
implicit none
real(kind=8), external :: g
real(kind=8), dimension(ndim), intent(in) :: a, b
integer, intent(in) :: ndim, npoints
integer :: seed1, i, j
real(kind=8), dimension(:), allocatable :: f, xin, x
real(kind=8) :: quad_mc
! ---
! Number of points & dimensions is unknown, so must allocate:
! ---
allocate(x(npoints)) ! random number array 0 to 1
allocate(xin(ndim)) ! random number array a+(b-a)x
allocate(f(npoints)) ! result of sum(x**2) over all dimensions
! for n points
! ---
! Generate random number array x once per call of quad_mc
! Length of the array is the number of random points
! ---
seed1 = 12345 ! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)
! ---
! Generate random input based on a and b and
! compute sum(x**2) for all dimensions
! ---
do i=1,npoints
do j=1,ndim
xin(j)=a(j)+(b(j)-a(j))*x(i)
enddo
f(i) = g(xin,ndim)
enddo
! ---
! Return the value of the Monte Carlo Integral by multiplying the
! average value by the base
! ---
quad_mc = product(b-a)*(sum(f)/npoints)
end function quad_mc
end module quadrature_mc
! ---
! This module returns seed for the random_number function
! ---
module random_util
contains
subroutine init_random_seed(seed1)
! ---
! Declare the parameters
! ---
integer, intent(inout) :: seed1
integer :: clock
integer, dimension(:), allocatable :: seed
! ---
! Determine how many numbers needed to seed and allocate
! ---
call random_seed(size = nseed)
allocate(seed(nseed))
! ---
! If seed1 = 0 then set seed1 randomly using the system clock.
! This will give different sequences of random numbers
! from different runs, so results are not reproducible.
! ---
if (seed1 == 0) then
call system_clock(count = clock)
seed1 = clock
endif
! ---
! If seed1 > 0 then results will be reproducible since calling this
! twice with the same seed will initialize the random
! number generator so the same sequence is generated.
! Once seed1 is set, set the other elements of the seed array by adding
! multiples of 37 as suggested in the documentation.
! ---
do i=1,nseed
seed(i) = seed1 + 37*(i-1)
enddo
! ---
! Seed the generator
! ---
call random_seed(put = seed)
! ---
! Deallocate the seed
! ---
deallocate(seed)
end subroutine init_random_seed
end module random_util
! ---
! This program computes the Monte Carlo Integral by doubling the number
! of points every time
! ---
program test_quad_mc
! ---
! Uses functions, quadrature_mc and random_util modules
! ---
use functions, only: g, gevals
use quadrature_mc, only: quad_mc
use random_util, only: init_random_seed
! ---
! Declare the parameters
! ---
implicit none
integer, parameter :: ndim = 20
real(kind=8), dimension(ndim) :: a, b
real(kind=8) :: volume, int_true, int_mc, int_mc_total, error
integer :: i, seed1, n_total, npoints
! ---
! Need to initialize number of g evaulations to zero, because it's a module variable
! ---
gevals = 0
! ---
! Open a file to write to
! ---
open(unit=25, file='mc_quad_error.txt', status='unknown')
! ---
! a and b are the vectors for the integral limits in all dimensions
! and are the same in all directions
! ---
do i=1,ndim
a(i) = 2.d0
b(i) = 4.d0
enddo
! ---
! Compute the true integral for special case where
! g(x) = sum(x**2) over all dimensions
! True integral = base * height,
! where height is the sum of the average value of the true integral in all dimensions
! ---
volume = product(b-a) ! = product of b(i)-a(i) of ndim dimensions
int_true = (volume) * sum((b**3 - a**3) / (3.d0*(b-a)))
print '("Testing Monte Carlo quadrature in ",i2," dimensions")', ndim
print '("True integral: ", es22.14)', int_true
! ---
! Start with Monte Carlo using only a few points.
! ---
npoints = 10
! ---
! Loop to successively and double the number of points used:
! ---
do i=1,17
! ---
! Compute integral using Monte Carlo method for a given set of points
! compute relative error and write these to file with number of points
! ---
int_mc = quad_mc(g,a,b,ndim,npoints)
error = abs(int_mc - int_true) / abs(int_true)
write(25,'(i10,e23.15,e15.6)') npoints, int_mc, error
! ---
! Double the number of points
! ---
npoints = 2*npoints
enddo
print '("Final approximation to integral: ",es22.14)',int_mc
print *, "Total g evaluations: ",gevals
end program test_quad_mc
from pylab import *
# read in three columns from file and unpack into 3 arrays:
n,int_approx,error = loadtxt('mc_quad_error.txt',unpack=True)
figure(1)
clf()
loglog(n,error,'-o',label='Monte-Carlo')
loglog([1,1e7],[1,sqrt(1e-7)],'k',label='1 / sqrt(N)')
legend()
xlabel('number of MC points used')
ylabel('abs(error)')
title('Log-log plot of relative error in MC quadrature')
savefig('mc_quad_error.png')
2.9. Project 3: Laplace’s Equation and the Monte Carlo Method¶
laplace_mc.py (python version)
problem_description.f90, laplace_mc.f90, mc_walk.f90, random_util.f90, Makefile (compute the solution to Laplace’s Equation using the Monte Carlo Method - serial)
test1.f90, test2.f90 (trials I made along the way)
plot_mc_quad_error.py (plot the result)
"""
Random walk approximate solution to Laplace's equation u_{xx} + u{yy} = 0.
Set demo==True to plot a few random walks interactively.
With demo==False many more walks are used to estimate the solution.
The boundary conditions are set for this test problem by evaluating the
true solution u(x,y) = x^2 - y^2 of Laplace's equation on the
boundary of the domain.
Moreover, the exact solution to the discrete equations
U_{i-1,j} + U_{i+1,j} + U_{i,j-1} + U_{i,j+1} - 4u_{ij} = 0
with boundary values obtained in this way is easily computed. It is simply
given by evaluating the exact solution at the grid points,
U_{ij} = x_i^2 - y_j^2
This is because the centered difference approximation to the second
derivative is exact when applied to a quadratic function.
This code implements a random walk on a lattice (rectangular grid) where in
each step the walk goes in one of 4 directions based on the value of a
random number that's uniformly distributed in [0,1].
"""
from pylab import *
from numpy.random import RandomState
import time, sys
# Set plot_walk to:
# True ==> 10 walks will be done, plotting each for illustration.
# False ==> Many more walks taken to investigate convergence.
plot_walk = False
# problem description:
ax = 0.
bx = 1.
ay = 0.4
by = 1.
nx = 19
ny = 11
dx = (bx-ax)/float(nx+1)
dy = (by-ay)/float(ny+1)
debug = False # to turn on some additional printing
def utrue(x,y):
"""
Return true solution for a test case where this is known.
This function should satisfy u_{xx} + u_{yy} = 0.
"""
utrue = x**2 - y**2
return utrue
def uboundary(x,y):
"""
Return u(x,y) if (x,y) is on the boundary.
"""
if (x-ax)*(x-bx)*(y-ay)*(y-by) != 0:
print "*** Not on the boundary"
raise Exception("*** uboundary called at non-boundary point")
return nan
# assume we know the true solution and can just evaluate there:
ub = utrue(x,y)
return ub
def plot_initialize(i0,j0):
"""
Set up plot for demo.
"""
figure(1,figsize=(7,8))
clf()
axes([.1,.3,.8,.6]) # leave room for printing messages
x = linspace(ax,bx,nx+2)
y = linspace(ay,by,ny+2)
X,Y = meshgrid(x,y) # turn into 2d arrays
plot(X,Y,'k')
plot(X.T,Y.T,'k')
plot([ax+i0*dx],[ay+j0*dy],'ro')
axis('scaled')
title("Initial point for random walk")
draw()
show()
time.sleep(1) # pause to see the initial location
def plot_ub(xb,yb,ub):
"""
Called when we hit the boundary
"""
plot([xb], [yb], 'ro')
text(ax, ay-0.2, 'Hit boundary where u = %9.6f' % ub, fontsize=20)
draw()
time.sleep(2)
def plot_step(iold, jold, i, j, istep):
"""
Called each step of the random walk to plot progress
"""
# plot next segment of walk and new point in red:
plot([ax+iold*dx, ax+i*dx], [ay+jold*dy, ay+j*dy], 'r',linewidth=2)
plot([ax+i*dx], [ay+j*dy], 'ro')
title("After %6i random steps" % (istep+1))
draw()
time.sleep(0.05)
# redraw last segment and point in blue:
plot([ax+iold*dx, ax+i*dx], [ay+jold*dy, ay+j*dy], 'b',linewidth=2)
plot([ax+i*dx], [ay+j*dy], 'bo')
draw()
def random_walk(i0, j0, max_steps):
"""
Take one random walk starting at (i0,j0) until we reach the boundary or
exceed max_steps steps.
Return the value at the boundary point reached, or nan if we failed.
"""
# starting point:
i = i0
j = j0
if plot_walk:
plot_initialize(i,j)
# generate as many random numbers as we could possibly need
# for this walk, since this is much faster than generating one at a time:
r = random_generator.uniform(0., 1., size=max_steps)
if debug:
print "+++ generated r: ", r
for istep in range(max_steps):
iold,jold = i,j # needed for plotting only
# Take the next random step with equal probability in each
# direction:
if r[istep] < 0.25:
i = i-1 # step left
elif r[istep] < 0.5:
i = i+1 # step right
elif r[istep] < 0.75:
j = j-1 # step down
else:
j = j+1 # step up
if plot_walk:
plot_step(iold,jold, i,j, istep)
# check if we hit the boundary:
if i*j*(nx+1-i)*(ny+1-j) == 0:
xb = ax + i*dx
yb = ay + j*dy
ub = uboundary(xb, yb)
if plot_walk:
plot_ub(xb,yb,ub)
if debug:
print "Hit boundary at (%7.3f, %7.3f) after %i7 steps, ub = %7.3f" \
% (xb,yb,istep+1,ub)
break # end the walk
if istep==(max_steps-1):
if debug:
print "Did not hit boundary after %i steps" % max_steps
if plot_walk:
text(0, -0.2, "Did not hit boundary after %i steps" \
% max_steps, fontsize=20)
draw()
time.sleep(2)
ub = nan
return ub
def many_walks(i0, j0, max_steps, n_mc):
ub_sum = 0. # to accumulate boundary values reached from all walks
n_success = 0 # to keep track of how many walks reached boundary
for k in range(n_mc):
i = i0
j = j0
ub = random_walk(i0, j0, max_steps)
if not isnan(ub):
# use this result unless walk didn't reach boundary
ub_sum += ub
n_success += 1
u_mc = ub_sum / n_success # average over successful walks
return u_mc, n_success
if __name__ == "__main__":
# MAIN PROGRAM
# Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6
i0 = round((x0-ax)/dx)
j0 = round((y0-ay)/dy)
# shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy
u_true = utrue(x0,y0)
print "True solution of PDE: u(%7.3f, %7.3f) = %9.5f" % (x0,y0,u_true)
print "Note: with solution used in demo this is also the solution to the"
print " the finite-difference equations on the same grid."
ans = raw_input("Enter seed for random generator or <return> .... ")
if ans == "":
seed = None # will cause a random seed to be used
else:
seed = int(ans) # convert string returned from raw_input into integer
random_generator = RandomState(seed)
# maximum number of step in each before giving up:
max_steps = 100*max(nx, ny)
# initial number of Monte-Carlo walks to take:
n_mc = 10
u_mc, n_success = many_walks(i0, j0, max_steps, n_mc)
error = abs((u_mc - u_true) / u_true)
print "After %8i random walks, u = %15.9f, rel. error = %15.6e" \
% (n_success, u_mc, error)
if plot_walk:
sys.exit() # quit after a few walks
# start accumulating totals:
u_mc_total = u_mc
n_total = n_success
for i in range(12):
u_sum_old = u_mc_total * n_total
u_mc, n_success = many_walks(i0, j0, max_steps, n_mc)
u_sum_new = u_mc * n_success
n_total = n_total + n_success
u_mc_total = (u_sum_old + u_sum_new) / n_total
error = abs((u_mc_total - u_true) / u_true)
print "After %8i random walks, u = %15.9f, rel. error = %15.6e" \
% (n_total, u_mc_total, error)
n_mc = 2*n_mc # double number of trials for next iteration
OBJECTS = random_util.o problem_description.o mc_walk.o test1.o
OBJECTS2 = random_util.o problem_description.o mc_walk.o test2.o
OBJECTS3 = random_util.o problem_description.o mc_walk.o laplace_mc.o
FFLAGS =
.PHONY: test test2 laplace
test: test.exe
./test.exe
test2: test2.exe
./test2.exe
laplace: laplace.exe
./laplace.exe
test.exe: $(OBJECTS)
gfortran $(FFLAGS) $(OBJECTS) -o test.exe
test2.exe: $(OBJECTS2)
gfortran $(FFLAGS) $(OBJECTS2) -o test2.exe
laplace.exe: $(OBJECTS3)
gfortran $(FFLAGS) $(OBJECTS3) -o laplace.exe
%.o : %.f90
gfortran $(FFLAGS) -c $<
clean:
rm -f *.o *.exe *.mod
module problem_description
implicit none
real(kind=8), parameter :: ax = 0.0d0
real(kind=8), parameter :: bx = 1.0d0
real(kind=8), parameter :: ay = 0.4d0
real(kind=8), parameter :: by = 1.0d0
integer, parameter :: nx = 19
integer, parameter :: ny = 11
real(kind=8), parameter :: dx = (bx - ax) / (nx+1)
real(kind=8), parameter :: dy = (by - ay) / (ny+1)
save
contains
function utrue(x, y)
! True solution for comparison, if known.
implicit none
real(kind=8), intent(in) :: x,y
real(kind=8) :: utrue
utrue = x**2 - y**2
end function utrue
function uboundary(x, y)
! Return u(x,y) assuming (x,y) is a boundary point.
implicit none
real(kind=8), intent(in) :: x,y
real(kind=8) :: uboundary
if ((x-ax)*(x-bx)*(y-ay)*(y-by) .ne. 0.d0) then
print *, "*** Error -- called uboundary at non-boundary point"
stop
endif
uboundary = utrue(x,y) ! assuming we know this
end function uboundary
end module problem_description
program laplace_mc
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, many_walks, nwalks
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy
real(kind=8) :: u_mc, x0, y0, rel_error
integer :: n_mc, n_success, i0, j0, max_steps
! Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6
i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)
! shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy
u_true = utrue(x0,y0)
!nwalks = 0
n_mc = 10
max_steps = 100*max(nx, ny)
! do i=1,13
! call many_walks(i0, j0, max_steps, n_mc, u_mc, n_success)
! ---
! Open a file to write to
! ---
open(unit=25, file='laplace_mc.txt', status='unknown')
!rel_error = abs(u_true-u_mc)/u_true
print *, ""
print '(" True solution of PDE at x =",f10.4," y =",f10.4)', x0, y0
print '(" u_true = ",es13.3)', u_true
print *, ""
print *, " Walks Monte Carlo Solution Relative Error"
do i=1,13
nwalks=0
call many_walks(i0, j0, max_steps, n_mc, u_mc, n_success)
n_mc = 2.0d0 * n_mc
rel_error = abs(u_true-u_mc)/u_true
print '(" ",i8," ",es13.3," ",es13.3)', nwalks, u_mc, rel_error
write(25,'(i10,e23.15,e15.6)') nwalks, u_mc, rel_error
enddo
end program laplace_mc
! ---
! This has two functions random_walk and many_walks
! ---
module mc_walk
! ---
! Uses random seed module
! ---
use random_util, only: init_random_seed
use problem_description, only: uboundary, ax, ay, dx, dy, nx, ny
! ---
! nwalks is a module variable
! Example use: use mc_walk, only nwalks
! ---
implicit none
integer :: nwalks
!integer :: iabort
save
contains
! ---
! A single random walk
! Inputs: i (step in x), j (step in y), max_steps (maximum number of steps)
! Outputs: ub (value at boundary), iabort (0 for success, 1 for failure)
! Example use: ub, iabort = random_walk(0, 0, 100, ub, iabort)
!
! i: 0, 1,.. nx, nx+1
! j:
! 0, (BC) (BC) (BC) (BC)
! 1, (BC) (BC)
! ... (BC) (BC)
! nj (BC) (BC)
! nj+1 (BC) (BC) (BC) (BC)
! ---
subroutine random_walk(i0, j0, max_steps, ub, iabort)
! ---
! Declare parameters
! ---
implicit none
integer, intent(in) :: i0, j0, max_steps ! i0 and j0 are inputs (static)
integer :: i, j ! i and j are dummy variables
real(kind=8), intent(out) :: ub
integer, intent(out) :: iabort
real(kind=8), dimension(:), allocatable :: x
integer :: istep, seed1
real(kind=8) :: xb, yb
! ---
! Number of steps is unknown, so we must allocate:
! ---
allocate(x(max_steps)) ! random number array 0 to 1
! ---
! Generate random number array x (0 to 1)
! ---
seed1 = 0 ! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)
i = i0
j = j0
!nwalks = 0
! ---
! Go through the vector of random numbers and
! make moves depending on the value of x
! ---
do istep=1,max_steps
if (x(istep) .LT. 0.25) then
i = i-1 ! go left
! print *, "go left"
! print *, "i=", i
! print *, "istep=", istep
! print *, "x=", x(istep)
else if (x(istep) .LT. 0.5) then
i = i+1 ! go right
! print *, "go right"
! print *, "i=", i
! print *, "istep=", istep
! print *, "x=", x(istep)
else if (x(istep) .LT. 0.75) then
j = j-1 ! go down
! print *, "go up"
! print *, "j=", j
! print *, "istep=", istep
! print *, "x=", x(istep)
else
j = j+1 ! go up
! print *, "go down"
! print *, "j=", j
! print *, "istep=", istep
! print *, "x=", x(istep)
endif
! ---
! What if we are at the boundary?
! ---
if (i*j*(nx+1 - i)*(ny+1 - j) .EQ. 0) then
! print *, "we are at the boundary"
! print *, " "
xb = ax + i*dx ! x is ax or ax+i*dx
yb = ay + j*dy ! y is ay or ay+j*dy
! print *, "xb=",xb
! print *, "yb=",yb
ub = uboundary(xb, yb) ! the boundary value is then computed
iabort = 0 ! success
go to 99 ! end do loop
endif
! ---
! What if we never reach the boundary within max steps?
! ---
if (istep .EQ. max_steps) then
! print *, "we aborted"
! print *, " "
ub = 0 ! set ub to zero
iabort = 1 ! failure
endif
enddo
99 continue
end subroutine random_walk
subroutine many_walks(i0, j0, max_steps, n_mc, u_mc, n_success)
! ---
! Declare parameters
! ---
implicit none
integer, intent(in) :: i0, j0, max_steps ! i0 and j0 are inputs (static)
integer, intent(in) :: n_mc
integer :: i, j, k, iabort !, nwalks ! i and j are dummy variables
real(kind=8), intent(out) :: u_mc
integer, intent(out) :: n_success
real(kind=8) :: ub_sum, ub
ub_sum = 0
n_success = 0
!nwalks = 0
do k=1,n_mc
!nwalks = 0
i = i0
j = j0
call random_walk(i, j, max_steps, ub, iabort)
if(iabort .NE. 1) then
ub_sum = ub_sum + ub
n_success = n_success + 1
nwalks = nwalks +1
endif
u_mc = ub_sum/n_success
enddo
end subroutine many_walks
end module mc_walk
! ---
! This module returns seed for the random_number function
! ---
module random_util
contains
subroutine init_random_seed(seed1)
! ---
! Declare the parameters
! ---
integer, intent(inout) :: seed1
integer :: clock
integer, dimension(:), allocatable :: seed
! ---
! Determine how many numbers needed to seed and allocate
! ---
call random_seed(size = nseed)
allocate(seed(nseed))
! ---
! If seed1 = 0 then set seed1 randomly using the system clock.
! This will give different sequences of random numbers
! from different runs, so results are not reproducible.
! ---
if (seed1 == 0) then
call system_clock(count = clock)
seed1 = clock
endif
! ---
! If seed1 > 0 then results will be reproducible since calling this
! twice with the same seed will initialize the random
! number generator so the same sequence is generated.
! Once seed1 is set, set the other elements of the seed array by adding
! multiples of 37 as suggested in the documentation.
! ---
do i=1,nseed
seed(i) = seed1 + 37*(i-1)
enddo
! ---
! Seed the generator
! ---
call random_seed(put = seed)
! ---
! Deallocate the seed
! ---
deallocate(seed)
end subroutine init_random_seed
end module random_util
program test1
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, nwalks
real(kind=8) :: ub
integer :: iabort, i, j, max_steps
integer, parameter :: nx = 19
integer, parameter :: ny = 11
! range of i is 1 to nx
! range of j is 1 to nj
i = 1
j = 1
max_steps = 100*max(nx, ny)
call random_walk(i, j, max_steps, ub, iabort)
print *, "ub=", ub
print *, "iabort=", iabort
print *, "nwalks=", nwalks
end program test1
program test2
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, many_walks, nwalks
real(kind=8) :: u_mc
integer :: n_mc, n_success, i, j, max_steps
integer, parameter :: nx = 19
integer, parameter :: ny = 11
! range of i is 1 to nx
! range of j is 1 to nj
i = 1
j = 1
max_steps = 100*max(nx, ny)
call many_walks(i, j, max_steps, n_mc, u_mc, n_success)
print *, "u_mc=", u_mc
print *, "n_success=", n_success
end program test2
from pylab import *
# read in three columns from file and unpack into 3 arrays:
n,int_approx,error = loadtxt('laplace_mc.txt',unpack=True)
figure(1)
clf()
loglog(n,error,'-o',label='Monte-Carlo')
loglog([1,1e7],[1,sqrt(1e-7)],'k',label='1 / sqrt(N)')
legend()
xlabel('number of MC points used')
ylabel('abs(error)')
title('Log-log plot of relative error in MC quadrature')
savefig('mc_quad_error.png')
2.10. Project 4: Laplace’s Equation, MPI and the Monte Carlo Method¶
problem_description.f90, laplace_mc.f90, mc_walk.f90, random_util.f90, Makefile (compute the solution to Laplace’s Equation using the Monte Carlo Method - using MPI)
test1.f90, test1b.f90, test2.f90 (trials I made along the way)
plot_mc_quad_error.py (plot the result)
# $MYHPSC/project/part4/Makefile
OBJECTS = random_util.o problem_description.o mc_walk.o laplace_mc.o
OBJECTS2 = random_util.o problem_description.o mc_walk.o test1.o
OBJECTS2b = random_util.o problem_description.o mc_walk.o test1b.o
OBJECTS3 = random_util.o problem_description.o mc_walk.o test2.o
NUMPROCS ?= 4
FLAGS ?=
.PHONY: mpi laplace test1 test1b test2
mpi: testquad.exe
mpiexec -n $(NUMPROCS) ./testquad.exe
test1: test1.exe
./test1.exe
test1b: test1b.exe
./test1b.exe
test2: test2.exe
./test2.exe
testquad.exe: $(OBJECTS)
mpif90 $(FLAGS) $(OBJECTS) -o testquad.exe
test1.exe: $(OBJECTS2)
gfortran $(FFLAGS) $(OBJECTS2) -o test1.exe
test1b.exe: $(OBJECTS2b)
gfortran $(FFLAGS) $(OBJECTS2b) -o test1b.exe
test2.exe: $(OBJECTS3)
gfortran $(FFLAGS) $(OBJECTS3) -o test2.exe
%.o : %.f90
mpif90 -c $(FLAGS) $<
laplace: laplace.exe
./laplace.exe
laplace.exe: $(OBJECTS)
gfortran $(FFLAGS) $(OBJECTS) -o laplace.exe
#%.o : %.f90
# gfortran $(FFLAGS) -c $<
clean:
rm -f *.o *.exe *.mod
! ---
! This module returns the true value and the value at the boundary
! ---
module problem_description
! ---
! Initilize parameters
! ---
implicit none
real(kind=8), parameter :: ax = 0.0d0
real(kind=8), parameter :: bx = 1.0d0
real(kind=8), parameter :: ay = 0.4d0
real(kind=8), parameter :: by = 1.0d0
integer, parameter :: nx = 19
integer, parameter :: ny = 11
real(kind=8), parameter :: dx = (bx - ax) / (nx+1)
real(kind=8), parameter :: dy = (by - ay) / (ny+1)
save
contains
! ---
! Function for the true solution (if known)
! Inputs: x and y
! Output: z
! ---
function utrue(x, y)
implicit none
real(kind=8), intent(in) :: x,y
real(kind=8) :: utrue
utrue = x**2 - y**2
end function utrue
! ---
! Function for the value at the boundary
! Inputs: x and y
! Output: z
! ---
function uboundary(x, y)
implicit none
real(kind=8), intent(in) :: x,y
real(kind=8) :: uboundary
! ---
! If we are not at the boundary we print an error and stop
! ---
if ((x-ax)*(x-bx)*(y-ay)*(y-by) .ne. 0.d0) then
print *, "*** Error -- called uboundary at non-boundary point"
stop
endif
! ---
! If we are at the boundary we compute the solution
! ---
uboundary = utrue(x,y) ! assuming we know this
end function uboundary
end module problem_description
! ---
! This program loops through the Monte Carlo solution of Laplace's equation,
! doubling the number of sample points every loop using MPI
! ---
program laplace_mc
! ---
! ALL PROCESSES: Modules and functions
! ---
use mpi
use random_util, only: init_random_seed
use mc_walk, only: random_walk, many_walks, n_walks, x, n_success_proc
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy
! ---
! ALL PROCESSES: Parameters, vectors and arrays
! ---
implicit none
real(kind=8) :: u_mc, x0, y0, rel_error, u_true
integer :: n_mc, n_success, i0, j0, max_steps, i, &
process_num, num_processes, ierr, &
istep, seed1, n_success_total
logical :: debug_many
! ---
! ALL PROCESSES: Number of steps is unknown, so we must allocate -
! 200000 is the maximum number of moves
! ---
n_mc = 10 ! This must be at least 3 as there are 3 processes
! Use n_mc = 10 if not debugging random_walk
max_steps = 200000*max(nx, ny)
allocate(x(max_steps)) ! random number array 0 to 1
! ----
! ALL PROCESSES: Initialize MPI
! ----
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_processes, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, process_num, ierr)
! ---
! ALL PROCESSES: Generate random number array x (0 to 1)
! ---
seed1 = 12345 ! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
seed1 = seed1 + 97*process_num ! unique for each process
print '(" Seed for process",i2, " is ",i5)', process_num, seed1
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for processes to print
call init_random_seed(seed1)
call random_number(x)
! ----
! ALL PROCESSES: Try it out from a specific (x0,y0):
! ----
x0 = 0.9
y0 = 0.6
i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)
! ----
! ALL PROCESSES: Shift (x0,y0) to a grid point if it wasn't already:
! ----
x0 = ax + i0*dx
y0 = ay + j0*dy
u_true = utrue(x0,y0)
n_walks = 0
! ---
! MASTER PROCESS: Open a file to write to, provide true solution and print output headers
! ---
if (process_num .EQ. 0) then
open(unit=25, file='laplace_mc.txt', status='unknown')
print '(" True solution of PDE at x =",f10.4," y =",f10.4)', x0, y0
print '(" u_true =",e23.15)', u_true
print *, ""
print *, " Successes Monte Carlo Solution Relative Error"
endif
! ---
! ALL PROCESSES: Loop through and call many_walks, doubling the number
! of Monte Carlo points by two each time
! ---
! MASTER PROCESS: Compute relative error and print output of many_walks
! ---
debug_many = .False. ! Best to remove the do loop if you're trying this
do i=1,13
call many_walks(i0, j0, max_steps, n_mc, u_mc, n_success, debug_many)
n_mc = 2.0d0 * n_mc
if (process_num .EQ. 0) then
rel_error = abs(u_true-u_mc)/u_true
print '(" ",i10," ",e23.15," ",e23.15)', n_success, u_mc, rel_error
write(25,'(i10,e23.15,e15.6)') n_success, u_mc, rel_error
endif
enddo
! ---
! ALL PROCESSES: Wait for process 0 to print
! ---
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
! ---
! MASTER PROCESS: Reduce n_success_proc to n_success_total
! ---
call MPI_REDUCE(n_success_proc, n_success_total, 1, &
MPI_DOUBLE_PRECISION, MPI_SUM, 0, MPI_COMM_WORLD, ierr)
! ---
! MASTER PROCESS: Reduce n_success_proc to n_success_total
! ---
if (process_num .EQ. 0) then
print '("Final approximation to u(x0,y0):",e23.15)', u_mc
print '("Total number of walks by all processes:",i10)', n_success_total
endif
! ---
! ALL PROCESSES: Wait for process 0 to print
! ---
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
! ---
! ALL PROCESSES: Print out process number and number of successes
! ---
print '("Walks by process ",i1,":",i8)', process_num, n_success_proc
! ----
! ALL PROCESSES: COMPLETE MPI
! ----
call MPI_FINALIZE(ierr)
end program laplace_mc
! ---
! This module performs a random walk and also many random walks
! ---
module mc_walk
! ---
! ALL PROCESSES: Use random seed module & mpi
! ---
use mpi
use problem_description, only: uboundary, ax, ay, dx, dy, nx, ny
! ---
! ALL PROCESSES: nwalks & n_success_proc are module variables
! Example use: use mc_walk, only nwalks, n_success_proc
! ---
implicit none
integer :: n_walks, n_success_proc
real(kind=8), dimension(:), allocatable :: x
save
contains
! ---
! ALL PROCESSES: A single random walk
! Inputs: i0 (step in x), j0 (step in y), max_steps (maximum number of steps)
! Outputs: ub (value at boundary), iabort (0 for success, 1 for failure)
! Example use: random_walk(0, 0, 100, ub, iabort)
!
! i: 0, 1,.. nx, nx+1
! j:
! 0, (BC) (BC) (BC) (BC)
! 1, (BC) (BC)
! ... (BC) (BC)
! nj (BC) (BC)
! nj+1 (BC) (BC) (BC) (BC)
! ---
subroutine random_walk(i0, j0, max_steps, ub, iabort, debug_random)
! ---
! ALL PROCESSES: Declare parameters
! ---
implicit none
integer, intent(in) :: i0, j0, max_steps
logical, intent(in) :: debug_random
real(kind=8), intent(out) :: ub
integer, intent(out) :: iabort
integer :: i, j, istep, seed1, start
real(kind=8) :: xb, yb
! ---
! ALL PROCESSES: Initialize the walk from it's starting point
! ---
i = i0
j = j0
start = n_walks
! ---
! ALL PROCESSES: Go through the vector of random numbers and
! make moves depending on the value of x
! ---
do istep=start,max_steps
if (x(istep) .LT. 0.25) then
i = i-1 ! go left
if (debug_random) then
print *, "go left"
print *, "i=", i
print *, "istep=", istep
print *, "x=", x(istep)
endif
n_walks = n_walks + 1
else if (x(istep) .LT. 0.5) then
i = i+1 ! go right
if (debug_random) then
print *, "go right"
print *, "i=", i
print *, "istep=", istep
print *, "x=", x(istep)
endif
n_walks = n_walks + 1
else if (x(istep) .LT. 0.75) then
j = j-1 ! go down
if (debug_random) then
print *, "go up"
print *, "j=", j
print *, "istep=", istep
print *, "x=", x(istep)
endif
n_walks = n_walks + 1
else
j = j+1 ! go up
if (debug_random) then
print *, "go down"
print *, "j=", j
print *, "istep=", istep
print *, "x=", x(istep)
endif
n_walks = n_walks + 1
endif
! ---
! ALL PROCESSES: What if we are at the boundary?
! ---
if (i*j*(nx+1 - i)*(ny+1 - j) .EQ. 0) then
xb = ax + i*dx ! x is ax or ax+i*dx
yb = ay + j*dy ! y is ay or ay+j*dy
ub = uboundary(xb, yb) ! the boundary value is then computed
iabort = 0 ! success
n_success_proc = n_success_proc + 1
if (debug_random) then
print *, "Reached the boundary"
endif
go to 99 ! end do loop
endif
! ---
! ALL PROCESSES: What if we never reach the boundary within max steps?
! ---
if (istep .EQ. max_steps) then
ub = 0 ! set ub to zero
iabort = 1 ! failure
n_success_proc = n_success_proc
if (debug_random) then
print *, "Failed to reach boundary"
endif
endif
enddo
99 continue
end subroutine random_walk
! ---
! ALL PROCESSES: A single random walk
! Inputs: i0 (step in x), j0 (step in y), max_steps (maximum number of steps)
! Outputs: ub (value at boundary), iabort (0 for success, 1 for failure)
! Example use: many_walks(0, 0, 100, ub, iabort)
! ---
subroutine many_walks(i0, j0, max_steps, n_mc, u_mc, n_success, debug_many)
! ---
! ALL PROCESSES: Declare parameters
! ---
implicit none
integer, intent(in) :: i0, j0, max_steps, n_mc
real(kind=8), intent(out) :: u_mc
integer, intent(out) :: n_success
logical, intent(in) :: debug_many
real(kind=8) :: ub_sum, ub, answer_r
real(kind=8), allocatable, dimension(:) :: int_array
integer :: i, j, k, iabort, num_processes, process_num, &
ierr, m, r, sender, iabort_r, answer, num_sent, next_walk, jj
integer, dimension(MPI_STATUS_SIZE) :: status
logical :: debug_random
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_processes, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, process_num, ierr)
ub_sum = 0
n_success = 0
num_sent = 0
n_success_proc = 0
! ---
! MASTER PROCESS
! ---
if(process_num .EQ. 0) then
allocate(int_array(n_mc)) ! to hold answer from Workers
! ---
! INITIALIZE THE PROCESS
! Send: 0, Tag: 1, To: Processes 1 to 3
! Increment number sent by 1
! ---
do m=1,num_processes-1
call MPI_SEND(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, &
m, 1, MPI_COMM_WORLD, ierr)
num_sent = num_sent + 1
if (debug_many) then
print '("Process ",i1," sent tag = ",i1, " to process ",i1)',process_num,1,m
endif
enddo
if (debug_many) then
print '("Number of values sent initially ",i1)',num_sent
endif
! ---
! RECIEVE ALL POSSIBLE CALLS TO RANDOM WALK
! Receive: answer_r, Tag: iabort (0 or 1), From: Any Process (1 to 3)
! Sender is now free
! ---
do m=1,n_mc
call MPI_RECV(answer_r, 1, MPI_DOUBLE_PRECISION, &
MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierr)
iabort_r = status(MPI_TAG)
sender = status(MPI_SOURCE) ! Whichever process sent it is now free
! ---
! HAVE WE REACHED THE BOUNDARY?
! If the walk reached the boundary, iabort = 0, so add it to the array
! ---
if(iabort_r .EQ. 0) then
int_array(m) = answer_r
n_success = n_success + 1
else
int_array(m) = 0
n_success = n_success
endif
if (debug_many) then
print '("Process ",i1," recieved answer = ",es22.14," tag = ",i3, " from process ",i1)', &
process_num,int_array(m),iabort_r,sender
endif
! ---
! SEND ADDITIONAL CALLS TO RANDOM WALK
! If the number sent is less than the number of Monte Carlo points
! Send: 0, Tag: 1, To: Processes 1 to 3 (whichever is free)
! Increment number sent by 1
! ---
if (num_sent .LT. n_mc) then
call MPI_SEND(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, &
sender, 1, MPI_COMM_WORLD, ierr)
num_sent = num_sent + 1
if (debug_many) then
print '("Process ",i1," sent tag = ",i1, " to process ",i1)', &
process_num,1,sender
endif
! ---
! END WORKER PROCESSES
! Send: 0, Tag: 0, To: Processes 1 to 3 (whichever is free)
! ---
else
call MPI_SEND(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, &
sender, 0, MPI_COMM_WORLD, ierr)
endif
enddo
! ---
! SUM OVER THE ARRAY TO RETURN U_MC TO MANY_WALKS
! ---
ub_sum=sum(int_array) ! Summation of the values from each process
u_mc = ub_sum/n_success ! Returned to many_walks
if (debug_many) then
print '("Returned answer ",es22.14)', u_mc
print '("Number of values sent finally ",i10)', num_sent
endif
endif
! ---
! WORKER PROCESS
! ---
if(process_num .GT. 0) then
do while (.true.)
! ---
! RECIEVE CALLS FROM MASTER PROCESS
! Receive: 0, Tag: 1, From: Master (0)
! Sender is now free
! ---
call MPI_RECV(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, &
MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierr)
r = status(MPI_TAG)
if (debug_many) then
print '("Process ",i1," recieved tag = ",i1, " from process ",i1)',process_num,r,0
endif
! ---
! THERE IS ANOTHER WALK TO DO
! Call Random Walk again
! ---
if(r .EQ. 1) then
i = i0
j = j0
debug_random = .False. ! Change n_mc to 3 if debugging this
call random_walk(i, j, max_steps, ub, iabort, debug_random)
! ---
! SEND THE ANSWER TO THE MASTER
! Send: ub, Tag: iabort(0 or 1), To: Master (Process 0)
! ---
call MPI_SEND(ub, 1, MPI_DOUBLE_PRECISION, &
0, iabort, MPI_COMM_WORLD, ierr)
if (debug_many) then
print '("Process ",i1," sent answer = ",es22.14, " & tag = ",i3, " to process ",i1)', &
process_num,ub,iabort,0
endif
! ---
! THERE ISN'T ANOTHER WALK TO DO
! Worker finishes
! ---
else if(r .EQ. 0) then
if (debug_many) then
print '("Process ",i1," ended ")', process_num
endif
go to 100
endif
enddo
endif
! ----
! COMPLETE MPI
! ----
100 continue ! Jumps here if the Worker finishes
end subroutine many_walks
end module mc_walk
! ---
! This module returns seed for the random_number function
! ---
module random_util
contains
subroutine init_random_seed(seed1)
! ---
! Declare the parameters
! ---
integer, intent(inout) :: seed1
integer :: clock
integer, dimension(:), allocatable :: seed
! ---
! Determine how many numbers needed to seed and allocate
! ---
call random_seed(size = nseed)
allocate(seed(nseed))
! ---
! If seed1 = 0 then set seed1 randomly using the system clock.
! This will give different sequences of random numbers
! from different runs, so results are not reproducible.
! ---
if (seed1 == 0) then
call system_clock(count = clock)
seed1 = clock
endif
! ---
! If seed1 > 0 then results will be reproducible since calling this
! twice with the same seed will initialize the random
! number generator so the same sequence is generated.
! Once seed1 is set, set the other elements of the seed array by adding
! multiples of 37 as suggested in the documentation.
! ---
do i=1,nseed
seed(i) = seed1 + 37*(i-1)
enddo
! ---
! Seed the generator
! ---
call random_seed(put = seed)
! ---
! Deallocate the seed
! ---
deallocate(seed)
end subroutine init_random_seed
end module random_util
program test1
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, n_walks, x
use random_util, only: init_random_seed
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy
real(kind=8) :: ub, rel_error, x0, y0
integer :: iabort, i, j, max_steps, seed1, n_success, i0, j0
! ---
! Number of steps is unknown, so we must allocate:
! ---
max_steps = 100*max(nx, ny)
allocate(x(max_steps)) ! random number array 0 to 1
! Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6
i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)
! shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy
u_true = utrue(x0,y0)
! ---
! Generate random number array x (0 to 1)
! ---
seed1 = 12345 ! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)
i = i0
j = j0
n_success = 0
print *, ""
print '(" True solution of PDE at x =",f10.4," y =",f10.4)', x0, y0
print '(" i =",i8," j =",i8)', i, j
print '(" u_true = ",es13.3)', u_true
call random_walk(i, j, max_steps, ub, iabort)
print *, "ub=", ub
print *, "x=",x(1)
print *, "iabort=", iabort
print *, "nwalks=", n_walks
end program test1
program test1b
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, n_walks, x
use random_util, only: init_random_seed
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy
real(kind=8) :: ub, rel_error, x0, y0
integer :: iabort, i, j, max_steps, seed1, n_success, n_mc
! integer, parameter :: nx = 19
! integer, parameter :: ny = 11
! ---
! Set the number of samples for the Monte Carlo Solution
! ---
n_mc=10
! ---
! Number of steps is unknown, so we must allocate:
! ---
max_steps = 20*max(nx, ny)*n_mc
! max_steps = 10
allocate(x(max_steps)) ! random number array 0 to 1
! ---
! Generate random number array x (0 to 1)
! ---
seed1 = 12345 ! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)
! Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6
i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)
! shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy
u_true = utrue(x0,y0)
i = i0
j = j0
! ---
! Record the number of times we successfully hit the boundary
! ---
n_success = 0
! ---
! Record the number of random walks we make
! ---
n_walks = 0
! ---
! Loop through and call a single random walk each time, indexing
! You must start the walk using the last successful walk point
! ---
do k=1,n_mc 0
call random_walk(i, j, max_steps, ub, iabort)
if(iabort .NE. 1) then
ub_sum = ub_sum + ub
n_success = n_success + 1
endif
u_mc = ub_sum/n_success
print *, ""
print *, "ub=", ub
! print *, "x=",x(k)
print *, "iabort=", iabort
print *, "nwalks=", n_walks
print *, "n_success=", n_success
print *, "u_mc=", u_mc
enddo
end program test1b
program test2
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, many_walks, n_walks, x
use random_util, only: init_random_seed
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy
real(kind=8) :: u_mc, rel_error, x0, y0
integer :: n_mc, n_success, i, j, max_steps, seed1, k, i0, j0
! ---
! Number of steps is unknown, so we must allocate
! 200000 is the maximum number of moves
! ---
n_mc = 10
max_steps = 200000*max(nx, ny)
!max_steps = 25
allocate(x(max_steps)) ! random number array 0 to 1
! Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6
i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)
! shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy
u_true = utrue(x0,y0)
! ---
! Generate random number array x (0 to 1)
! ---
seed1 = 12345 ! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)
! ---
! Set the point of interest
! ---
i = i0
j = j0
! ---
! Record the number of random walks we make
! ---
n_walks = 0
print *, ""
print '(" True solution of PDE at x =",f10.4," y =",f10.4)', x0, y0
print '(" u_true = ",es13.3)', u_true
print *, ""
print *, " Successes Walks Monte Carlo Solution Relative Error"
do k=1,13
call many_walks(i, j, max_steps, n_mc, u_mc, n_success)
n_mc = 2 * n_mc
rel_error = abs(u_true-u_mc)/u_true
print '(" ",i8," ",i8," ",es13.3," ",es13.3)', &
n_success, n_walks, u_mc, rel_error
enddo
end program test2
from pylab import *
# read in three columns from file and unpack into 3 arrays:
n,int_approx,error = loadtxt('laplace_mc.txt',unpack=True)
figure(1)
clf()
loglog(n,error,'-o',label='Monte-Carlo')
loglog([1,1e7],[1,sqrt(1e-7)],'k',label='1 / sqrt(N)')
legend()
xlabel('number of MC points used')
ylabel('abs(error)')
title('Log-log plot of relative error in MC quadrature')
savefig('mc_quad_error.png')