# 2. HPSC MOOC¶

These projects are based on the course “High Performance Scientific Computing” by Professor Randall LeVeque.

• Workspace 1: Testing Scripts and Dummy Code
• test1.py (python script to test for environment variables and python version)

• test2.sh (shell script to test for environment variables, gfortran and ipython version)

• test3.f90 (a dummy fortran code)

• Workspace 2: Quadratic, Cubic and Polynomial Interpolation
• hw2a.py (python script for quadratic interpolation)

• hw2b.py (python script for quadratic, cubic and polynomial interpolation)

• Workspace 3: Newton’s Method and Convergence Criteria
• newton.py (ipython module to compute square root of a number using Newton’s method)

• intersections.py (ipython script to plot a graph where two functions intersect - uses newton.py)

• test1.f90, newton.f90, intersections.f90, functions.f90, Makefile (fortran code for the intersection problem using max iterations for convergence)

• problem7/newton.f90, problem7/functions.f90, problem7/test_quartic.f90, problem7/Makefile, problem7/intersections.py (intersection problem using residual error for convergence)

• Workspace 4: OpenMP
• quadrature.f90 (module for integrating a function)

• test1.f90 (cubic function - uses quadrature module)

• test2.f90 (sinusoidal function - uses quadrature module)

• quadrature_omp.f90, test2_omp.f90 (same as above but using OpenMP)

• Makefile (makefile for the 1D solution)

• functions.f90 (sinusoidal function module)

• quadrature.f90, test.f90 (trapezoid rule module + test - serial)

• quadrature2.f90, test2.f90 (simpson’s rule module + test - serial)

• quadrature3.f90, test3.f90 (trapezoid rule module + test - OpenMP)

• Workspace 6: MPI
• Makefile (makefile for this directory)

• copyvalue.f90 (passes a value from Process 0 to Processes 1-3 using MPI)

• copyvalue2.f90 (passes a value from Process 3 to Processes 0-2 using MPI)

• part2/test.f90 (compute the same integral over all Processes using MPI)

• part2/test2.f90 (integral using Master-Worker paradigm)

• Project 1: 1D Integral and MPI
• functions.f90, quadrature.f90, test2.f90, Makefile (compute a 1D integral using MPI by dividing up intervals between an indeterminate number of processes)

• Project 2: 20D Integral and the Monte Carlo Method
• functions.f90, quadrature_mc.f90, random_util.f90, test_quad_mc.f90, Makefile (compute integral using Monte Carlo method for a 20 dimensional integral - serial)

• Project 3: Laplace’s Equation and the Monte Carlo Method
• laplace_mc.py (python version)

• problem_description.f90, laplace_mc.f90, mc_walk.f90, random_util.f90, Makefile (compute the solution to Laplace’s Equation using the Monte Carlo Method - serial)

• test1.f90, test2.f90 (trials I made along the way)

• Project 4: Laplace’s Equation, MPI and the Monte Carlo Method
• problem_description.f90, laplace_mc.f90, mc_walk.f90, random_util.f90, Makefile (compute the solution to Laplace’s Equation using the Monte Carlo Method - using MPI)

• test1.f90, test1b.f90, test2.f90 (trials I made along the way)

## 2.1. Workspace 1: Testing Scripts and Dummy Code¶

• test1.py (python script to test for environment variables and python version)

"""
test1.py file

Note: this is a Python script.  What you are reading now in the triple
quotes is a comment (a "docstring" since it's the first thing in this file).

This file is to test whether you have environment variables set properly and
are able to import numpy (for numerical computation), matplotlib,
and pylab (for plotting).

Instructions:

Make sure your environment variables $UWHPSC and$MYHPSC are set properly.

Insert your name where instructed below.

Type
$python test1.py at the Unix prompt ($ represents the prompt) to run the code.

If the output looks ok, change the file to set debug_mode = False
below and run it again.  This time the results will go into a file
test1output.txt.   Check that these look ok.

Commit the final version of this file along with test1output.txt to your git
repository.

"""

import os,sys

debug_mode = False

if debug_mode:
output_file = sys.stdout  # send output to screen
else:
# Once it's working, send output to a file rather than screen:
output_file = open("test1output.txt","w")
sys.stdout = output_file

print "Code run by **Andrew Roberts**"

UWHPSC = os.environ.get('UWHPSC','** not set **')
print "Environment variable UWHPSC is %s"  % UWHPSC

MYHPSC = os.environ.get('MYHPSC','** not set **')
print "Environment variable MYHPSC is %s"  % MYHPSC

# Check for various Python modules and print an error message if
# an exception occurs:

try:
import numpy
print "Imported numpy ok"
except:
print "** Error importing numpy"

try:
import matplotlib
print "Imported matplotlib ok"
except:
print "** Error importing matplotlib"

try:
import pylab
print "Imported pylab ok"
except:
print "** Error importing pylab"

if not debug_mode:
output_file.close()  # close the text file


• test2.sh (shell script to test for environment variables, gfortran and ipython version)


# bash script to check for software
# Insert your name where indicated below, then type
#    $bash test2.sh # When it looks good, redirect output to a file: #$ bash test2h.sh > test2output.txt

echo
echo Code run by  **Andrew Roberts**
echo Environment variable UWHPSC is $UWHPSC echo Environment variable MYHPSC is$MYHPSC

echo
echo which ipython returns...
which ipython

echo
echo which gfortran returns...
which gfortran

echo
echo gfortran --version returns...
gfortran --version

echo
echo Compiling and running a Fortran code...
gfortran test3.f90
./a.out


• test3.f90 (a dummy fortran code)

! Test fortran program.
! Insert your name where indicated below.
! This code will be compiled and run when you execute the test2.sh script.

program test
print *, "Code run by  **Andrew Roberts**"
print *, "Successfully ran Fortran 90 program"
end program test


## 2.2. Workspace 2: Quadratic, Cubic and Polynomial Interpolation¶

• hw2a.py (python script for quadratic interpolation)

"""
Update this docstring to describe your code.
Modified by: ** your name here **
"""

import numpy as np
import matplotlib.pyplot as plt
from numpy.linalg import solve

# Set up linear system to interpolate through data points:

# Data points:
xi = np.array([-1., 1., 2])
yi = np.array([0., 4., 3.])

# It would be better to define A in terms of the xi points.
# Doing this is part of the homework assignment.
A = np.array([[1., -1., 1.], [1., 1., 1.], [1., 2., 4.]])
b = yi

# Solve the system:
c = solve(A,b)

print "The polynomial coefficients are:"
print c

# Plot the resulting polynomial:
x = np.linspace(-2,3,1001)   # points to evaluate polynomial
y = c[0] + c[1]*x + c[2]*x**2

plt.figure(1)       # open plot figure window
plt.clf()           # clear figure
plt.plot(x,y,'b-')  # connect points with a blue line

# Add data points  (polynomial should go through these points!)
plt.plot(xi,yi,'ro')   # plot as red circles
plt.ylim(-2,8)         # set limits in y for plot

plt.title("Data points and interpolating polynomial")

plt.savefig('demo1plot.png')   # save figure as .png file

• hw2b.py (python script for quadratic, cubic and polynomial interpolation)


"""
Update this docstring to describe your code.
Modified by: ** Andrew Roberts **
"""

import numpy as np
import matplotlib.pyplot as plt
from numpy.linalg import solve

"""
Quadratic interpolation.  Compute the coefficients of the polynomial
interpolating the points (xi[i],yi[i]) for i = 0,1,2.
Returns c, an array containing the coefficients of
p(x) = c[0] + c[1]*x + c[2]*x**2.

"""

# check inputs and print error message if not valid:

error_message = "xi and yi should have type numpy.ndarray"
assert (type(xi) is np.ndarray) and (type(yi) is np.ndarray), error_message

error_message = "xi and yi should have length 3"
assert len(xi)==3 and len(yi)==3, error_message

# Set up linear system to interpolate through data points:

A = np.vstack([np.ones(3), xi, xi**2]).T
b = yi
c = solve(A,b)

return c

def cubic_interp(xi,yi):
"""
Cubic interpolation.  Compute the coefficients of the polynomial
interpolating the points (xi[i],yi[i]) for i = 0,1,2,3.
Returns c, an array containing the coefficients of
p(x) = c[0] + c[1]*x + c[2]*x**2 +c[3]x**3.

"""

# check inputs and print error message if not valid:

error_message = "xi and yi should have type numpy.ndarray"
assert (type(xi) is np.ndarray) and (type(yi) is np.ndarray), error_message

error_message = "xi and yi should have length 4"
assert len(xi)==4 and len(yi)==4, error_message

# Set up linear system to interpolate through data points:

A = np.vstack([np.ones(4), xi, xi**2, xi**3]).T
b = yi
c = solve(A,b)

return c

def poly_interp(xi,yi):

"""Polynomial interpolation.  Compute the coefficients of the polynomial
interpolating the points (xi[i],yi[i]) for i = 0,1,2... n
Returns c, an array containing the coefficients of
p(x) = c[0] + c[1]*x + c[2]*x**2 + c[3]x**3 +...+ c[n]x**n.
"""

# check inputs and print error message if not valid:

error_message = "xi and yi should have type numpy.ndarray"
assert (type(xi) is np.ndarray) and (type(yi) is np.ndarray), error_message

error_message = "xi and yi should have the same length"
assert len(xi)== len(yi), error_message

# Set up linear system to interpolate through data points:

#stacks them vertically and then takes transpose
#A = np.empty([len(xi),len(xi)])
#A[:,0] = 1.
#for i in range(0,len(xi),1):
#    for j in range(1,len(xi),1):
#        A[i,j]=xi[i]**(j)

# Uses a list comprehension:
n = len(xi)
A = np.vstack([xi**j for j in range(n)]).T
b = yi
c = solve(A,b)

return c

x = np.linspace(xi.min()-1, xi.max() + 1, 1000)
y = c[0]+c[1]*x + c[2]*x**2
plt.figure(1)
plt.clf()
plt.plot(x,y,'b-')
plt.plot(xi,yi,'ro')
plt.xlabel('x (-)')
plt.ylabel('y (-)')
plt.title("Data points and interpolated polynomial")

def plot_cubic(xi,yi):
c = cubic_interp(xi,yi)
x = np.linspace(xi.min()-1, xi.max() + 1, 1000)
y = c[0]+c[1]*x + c[2]*x**2 + c[3]*x**3
plt.figure(1)
plt.clf()
plt.plot(x,y,'b-')
plt.plot(xi,yi,'ro')
plt.xlabel('x (-)')
plt.ylabel('y (-)')
plt.title("Data points and interpolated polynomial")
plt.savefig('cubic.png')

def plot_poly(xi,yi):
c = poly_interp(xi,yi)
x = np.linspace(xi.min()-1, xi.max() + 1, 1000)
n = len(c)
y = c[n-1]
for j in range(n-1, 0, -1):
y = y*x + c[j-1]

plt.figure(1)
plt.clf()
plt.plot(x,y,'b-')
plt.plot(xi,yi,'ro')
plt.xlabel('x (-)')
plt.ylabel('y (-)')
plt.title("Data points and interpolated polynomial")
plt.savefig('poly.png')

"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-1.,  0.,  2.])
yi = np.array([ 1., -1.,  7.])
c_true = np.array([-1.,  0.,  2.])
print "c =      ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)

"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-10.,  1.,  20.])
yi = np.array([ 15., -50.,  70.])
c_true = np.array([-48.16586922,  -2.24162679,   0.40749601])
print "c =      ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)

def test_cubic1():
"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-1., 1., 2., 3.])
yi = np.array([5., 2., 3., 5])
c = cubic_interp(xi,yi)
c_true = np.array([ 2.5, -1.41666667, 1., -0.08333333])
print "c =      ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)

def test_poly1():
"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-1., 1., 2., 3.])
yi = np.array([5., 2., 3., 5])
c = cubic_interp(xi,yi)
c_true = np.array([ 2.5, -1.41666667, 1., -0.08333333])
print "c =      ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)

def test_poly2():
"""
Test code, no return value or exception if test runs properly.
"""
xi = np.array([-1., 0., 2., 5., 8.])
yi = np.array([ 1., -1., 7., 15., 25.])
c = poly_interp(xi,yi)
c_true = np.array([-1., 1.22777778, 2.51944444, -0.66111111, 0.04722222])
print "c =      ", c
print "c_true = ", c_true
# test that all elements have small error:
assert np.allclose(c, c_true), \
"Incorrect result, c = %s, Expected: c = %s" % (c,c_true)

if __name__=="__main__":
# "main program"
# the code below is executed only if the module is executed at the command line,
#    $python demo2.py # or run from within Python, e.g. in IPython with # In[ ]: run demo2 # not if the module is imported. print "Running test..." test_quad1()  ## 2.3. Workspace 3: Newton’s Method and Convergence Criteria¶ • newton.py (ipython module to compute square root of a number using Newton’s method) """ Module to compute the square root of a number using Newton's method: f(x) = x**2 - n = 0 Parameters: max_iter :- maximum number of iterations for iterative method tol :- relative residual tolerance for iterative method """ import numpy as np max_iter = 20 tol = 1.0e-14 def solve(x0, debug=False): """ Compute the solution to the function using the initial guess & allow debugging Inputs: fvals_sqrt :- function returning tuple containing values of f(x) and f'(x) x0 :- initial guess of the square root debug :- prints x0 outside loop and x within loop, default = False Outputs: x :- computed value of the square root iters :- the number of iterations taken Example usage: import newton; solve(4., True) """ iters = 0 x = x0 if(debug): print "Initial guess: x = %20.15e" %x0 for i in range(max_iter): (fx,dfx) = fvals_sqrt(x) if(abs(fx)<tol): break x = x - (fx/dfx) iters = iters + 1 if(debug): print "At iteration %d x = %20.15e" %(iters,x) return (x, iters) def fvals_sqrt(x): """ Return a tuple containing the value of the function & it's derivative at x Inputs: x :- value for which f(x) and f'(x) are to be returned Outputs: f :- the value of the function, f(x) fp :- the derivative of the function, f'(x) Example usage: (f, fp) = fvals_sqrt(10) """ #f = x**2 - 4.0 #fp = 2.0*x f = x*np.cos(np.pi*x)-(1-(0.6*x**2)) fp = (-x*np.pi)*np.sin(np.pi*x)+np.cos(np.pi*x)+(1.2*x) return (f, fp) def test1(debug_solve=False): """ Test Newton's method using an array of initial gusses Inputs: None Outputs: Print x, iters, and f(x) to the screen Example usage: import newton; newton.test1() """ #for x in [1., 2., 100.]: for x in [-2.2, -1.6, -0.8, 1.4]: print "" (xout, iters) = solve(x, debug=debug_solve) print "After %d iterations, the solver returns x = %20.15e" %(iters,xout) (f, fp) = fvals_sqrt(xout) print "The function f(x) = %20.15e" %f #assert abs(xout-2.0) < tol, "***Unexpected result: x = %22.15e" %xout  • intersections.py (ipython script to plot a graph where two functions intersect - uses newton.py) """ Script to plot g1(x) and g2(x) versus x and to show where they intersect with black dots Example use: >>> run intersections """ import newton import matplotlib.pyplot as plt import numpy as np guesses = np.array([-2.2, -1.6, -0.8, 1.4]) xarray = np.zeros(4) yarray = np.zeros(4) i = 0 for xin in guesses: (xout, iters) = newton.solve(xin) xarray[i] = xout yarray[i] = (1-(0.6*xout**2)) i = i+1 x = np.linspace(-5,5,1000) g1 = x*np.cos(np.pi*x) g2 = (1-(0.6*x**2)) plt.figure(1) plt.clf() plt.xlim(-5,5) plt.xlabel('x (-)') plt.ylabel('y (-)') plt.title("Two functions and their intersecting points") #p1 = plt.plot(x, g1, 'b', label=r'$y(x) = xcos(\pix)$') p1 = plt.plot(x, g1, 'b', label=r'$y(x) = xcos(\pi x)$') p2 = plt.plot(x, g2, 'r', label=r'$y(x) = 1-(0.6x^2)$') p3 = plt.plot(xarray, yarray, 'ko', label=r'$intersection$') plt.legend(loc="best") #plt.legend([p1, p2, p3],["g1(x) = xcos(pi*x)", "g2(x) = 1-(0.6x**2)", "intersection"]) plt.show() #show the plot on the screen plt.savefig('intersection.png') # save figure as .png file  • test1.f90, newton.f90, intersections.f90, functions.f90, Makefile (fortran code for the intersection problem using max iterations for convergence) !$MYHPSC/homework3/test1.f90

program test1

use newton, only: solve
use functions, only: f_sqrt, fprime_sqrt

implicit none
real(kind=8) :: x, x0, fx
real(kind=8) :: x0vals(3)
integer :: iters, i
logical :: debug         ! set to .true. or .false.

print *, "Test routine for computing zero of f"
debug = .true.

! values to test as x0:
x0vals = (/1.d0, 2.d0, 100.d0 /)

do i=1,3
x0 = x0vals(i)
print *, ' '  ! blank line
call solve(f_sqrt, fprime_sqrt, x0, x, iters, debug)

print 11, x, iters
11      format('solver returns x = ', es22.15, ' after', i3, ' iterations')

fx = f_sqrt(x)
print 12, fx
12      format('the value of f(x) is ', es22.15)

if (abs(x-2.d0) > 1d-14) then
print 13, x
13          format('*** Unexpected result: x = ', es22.15)
endif
enddo

end program test1

! $MYHPSC/homework3/newton.f90 module newton ! module parameters: implicit none integer, parameter :: maxiter = 20 real(kind=8), parameter :: tol = 1.d-14 contains subroutine solve(f, fp, x0, x, iters, debug) ! Estimate the zero of f(x) using Newton's method. ! Input: ! f: the function to find a root of ! fp: function returning the derivative f' ! x0: the initial guess ! debug: logical, prints iterations if debug=.true. ! Returns: ! the estimate x satisfying f(x)=0 (assumes Newton converged!) ! the number of iterations iters implicit none real(kind=8), intent(in) :: x0 real(kind=8), external :: f, fp logical, intent(in) :: debug real(kind=8), intent(out) :: x integer, intent(out) :: iters ! Declare any local variables: real(kind=8) :: deltax, fx, fxprime integer :: k ! initial guess x = x0 if (debug) then print 11, x 11 format('Initial guess: x = ', es22.15) endif ! Newton iteration to find a zero of f(x) do k=1,maxiter ! evaluate function and its derivative: fx = f(x) fxprime = fp(x) if (abs(fx) < tol) then exit ! jump out of do loop endif ! compute Newton increment x: deltax = fx/fxprime ! update x: x = x - deltax if (debug) then print 12, k,x 12 format('After', i3, ' iterations, x = ', es22.15) endif enddo if (k > maxiter) then ! might not have converged fx = f(x) if (abs(fx) > tol) then print *, '*** Warning: has not yet converged' endif endif ! number of iterations taken: iters = k-1 end subroutine solve end module newton  !$MYHPSC/homework3/intersections.f90

program intersections

use newton, only: solve
use functions, only: f_function, fprime_function

implicit none
real(kind=8) :: x, x0, fx
real(kind=8) :: x0vals(4)
integer :: iters, i
logical :: debug         ! set to .true. or .false.

print *, "Test routine for computing zero of f"
debug = .true.

! values to test as x0:
x0vals = (/-2.2d0, -1.6d0, -0.8d0, 1.4d0 /)

do i=1,4
x0 = x0vals(i)
print *, ' '  ! blank line
call solve(f_function, fprime_function, x0, x, iters, debug)

print 11, x, iters
11      format('solver returns x = ', es22.15, ' after', i3, ' iterations')

fx = f_function(x)
print 12, fx
12      format('the value of f(x) is ', es22.15)

!        if (abs(x-2.d0) > 1d-14) then
!            print 13, x
!13          format('*** Unexpected result: x = ', es22.15)
!            endif
enddo

end program intersections

! $MYHPSC/homework3/functions.f90 module functions ! parameters for module ! implicit none ! real(kind=8) :: pi ! save contains real(kind=8) function f_sqrt(x) implicit none real(kind=8), intent(in) :: x f_sqrt = x**2 - 4.d0 end function f_sqrt real(kind=8) function fprime_sqrt(x) implicit none real(kind=8), intent(in) :: x fprime_sqrt = 2.d0 * x end function fprime_sqrt real(kind=8) function f_function(x) implicit none real(kind=8), intent(in) :: x real(kind=8) :: pi = 3.141592653589793 f_function = (x*cos(pi*x))-(1.0d0-(0.6d0*(x**2))) end function f_function real(kind=8) function fprime_function(x) implicit none real(kind=8), intent(in) :: x real(kind=8) :: pi = 3.141592653589793 fprime_function = ((-x*pi)*sin(pi*x))+cos(pi*x)+(1.2d0*x) end function fprime_function end module functions  #$MYHPSC/homework3/Makefile
# Example usage:
#	$make test1 #$ make clean
#	$make intersections OBJECTS = functions.o newton.o test1.o OBJECTS2 = functions.o newton.o intersections.o MODULES = functions.mod newton.mod FFLAGS = -g .PHONY: test1 clean intersections test1: test1.exe ./test1.exe intersections: intersections.exe ./intersections.exe test1.exe:$(MODULES) $(OBJECTS) gfortran$(FFLAGS) $(OBJECTS) -o test1.exe intersections.exe:$(MODULES) $(OBJECTS2) gfortran$(FFLAGS) $(OBJECTS2) -o intersections.exe %.o : %.f90 gfortran$(FFLAGS) -c  $< %.mod: %.f90 gfortran$(FFLAGS) -c $< clean: rm -f *.o *.exe *.mod  • problem7/newton.f90, problem7/functions.f90, problem7/test_quartic.f90, problem7/Makefile, problem7/intersections.py (intersection problem using residual error for convergence) !$MYHPSC/homework3/problem7/newton.f90

module newton

! module parameters:
implicit none
integer, parameter :: maxiter = 40
real(kind=8) :: tol
save

contains

subroutine solve(f, fp, x0, x, iters, debug)

! Estimate the zero of f(x) using Newton's method.
! Input:
!   f:  the function to find a root of
!   fp: function returning the derivative f'
!   x0: the initial guess
!   debug: logical, prints iterations if debug=.true.
! Returns:
!   the estimate x satisfying f(x)=0 (assumes Newton converged!)
!   the number of iterations iters

implicit none
real(kind=8), intent(in) :: x0
real(kind=8), external :: f, fp
logical, intent(in) :: debug
real(kind=8), intent(out) :: x
integer, intent(out) :: iters

! Declare any local variables:
real(kind=8) :: deltax, fx, fxprime
integer :: k

! initial guess
x = x0

if (debug) then
print 11, x
11     format('Starting with initial guess: x = ', es22.15)
endif

! Newton iteration to find a zero of f(x)

do k=1,maxiter

! evaluate function and its derivative:
fx = f(x)
fxprime = fp(x)

if (abs(fx/fxprime) < tol) then
exit  ! jump out of do loop
endif

! compute Newton increment x:
deltax = fx/fxprime

! update x:
x = x - deltax

if (debug) then
print 12, k,x
12         format('After', i3, ' iterations, x = ', es22.15)
endif

enddo

if (k > maxiter) then
! might not have converged

fx = f(x)
if (abs(fx) > tol) then
print *, '*** Warning: has not yet converged'
endif
endif

! number of iterations taken:
iters = k-1

end subroutine solve

end module newton

! $MYHPSC/homework3/problem7/functions.f90 module functions ! parameters for module implicit none real(kind=8) :: alpha save contains real(kind=8) function f_sqrt(x) implicit none real(kind=8), intent(in) :: x f_sqrt = x**2 - 4.d0 end function f_sqrt real(kind=8) function fprime_sqrt(x) implicit none real(kind=8), intent(in) :: x fprime_sqrt = 2.d0 * x end function fprime_sqrt real(kind=8) function f_function(x) implicit none real(kind=8), intent(in) :: x real(kind=8) :: pi = 3.141592653589793 f_function = (x*cos(pi*x))-(1.0d0-(0.6d0*(x**2))) end function f_function real(kind=8) function fprime_function(x) implicit none real(kind=8), intent(in) :: x real(kind=8) :: pi = 3.141592653589793 fprime_function = ((-x*pi)*sin(pi*x))+cos(pi*x)+(1.2d0*x) end function fprime_function real(kind=8) function f_quartic(x) implicit none real(kind=8), intent(in) :: x f_quartic = (x-1)**4 - alpha end function f_quartic real(kind=8) function fprime_quartic(x) implicit none real(kind=8), intent(in) :: x fprime_quartic = 4*(x-1)**3 end function fprime_quartic end module functions  !$MYHPSC/homework3/problem7/test_quartic.f90

program test_quartic

use newton, only: solve, tol
use functions, only: f_quartic, fprime_quartic, alpha

implicit none
real(kind=8) :: x, x0, fx, dfx, xstar
real(kind=8) :: tolvals(3), alphavals(3)
integer :: iters, i, j
logical :: debug         ! set to .true. or .false.

! values to test as x0:
x0 = 4.0d0

print 10, x0
10	format("Starting with initial guess x = ", es22.15)

debug = .false.

! values to use as tolerance:
tolvals = (/1.0d-5, 1.0d-10, 1.0d-14/)

! values to use as alpha:
alphavals = (/1.0d-4, 1.0d-8, 1.0d-12/)

print *, ' '  ! blank line
print *, '   alpha        tol        iters          x              f(x)/f(x)   x-xstar'

do i=1,3
do j =1,3
tol = tolvals(j)
alpha = alphavals(i)
xstar = 1+(alpha**0.25)
call solve(f_quartic, fprime_quartic, x0, x, iters, debug)
fx = f_quartic(x)
dfx = fprime_quartic(x)
print 11, alpha, tol, iters, x, fx/dfx, x-xstar
11 			format(2es13.3, i4, es24.15, 2es13.3)

enddo
print *, ' '  ! blank line
enddo

end program test_quartic

# $MYHPSC/homework3/problem7/Makefile # Example usage: #$ make clean
#	$make test_quartic OBJECTS = functions.o newton.o test_quartic.o MODULES = functions.mod newton.mod FFLAGS = -g .PHONY: clean test_quartic test_quartic: test_quartic.exe ./test_quartic.exe test_quartic.exe:$(MODULES) $(OBJECTS) gfortran$(FFLAGS) $(OBJECTS) -o test_quartic.exe %.o : %.f90 gfortran$(FFLAGS) -c  $< %.mod: %.f90 gfortran$(FFLAGS) -c $< clean: rm -f *.o *.exe *.mod  """ Script to plot g1(x) and g2(x) versus x and to show where they intersect with black dots Example use: >>> run intersections """ import newton import matplotlib.pyplot as plt import numpy as np guesses = np.array([-4,4]) xarray = np.zeros(2) yarray = np.zeros(2) i = 0 for xin in guesses: (xout, iters) = newton.solve(xin) xarray[i] = xout yarray[i] = ((xout-1.0)**4.0)-1.0e-4 i = i+1 x = np.linspace(-5,5,1000) g1 = ((x-1.0)**4.0)-1.0e-4 g2 = 0*x plt.figure(1) plt.clf() plt.ylim(-5e-4,5e-4) plt.xlim(0,2) plt.xlabel('x (-)') plt.ylabel('y (-)') plt.title("Two functions and their intersecting points") #p1 = plt.plot(x, g1, 'b', label=r'$y(x) = xcos(\pix)$') p1 = plt.plot(x, g1, 'b', label=r'$y(x) = (x - 1)^4 - 1\times 10^{-4}$') p2 = plt.plot(x, g2, 'r', label=r'$y(x) = 0$') p3 = plt.plot(xarray, yarray, 'ko', label=r'$intersection$') plt.legend(loc="best") plt.show() #show the plot on the screen plt.savefig('intersection.png') # save figure as .png file  ## 2.4. Workspace 4: OpenMP¶ • quadrature.f90 (module for integrating a function) module quadrature ! Performs integration using quadrature integration ! and creates a table of the error between this and ! the known solution. contains real(kind=8) function trapezoid(f,a,b,n) implicit none real(kind=8), external :: f real(kind=8), intent(in) :: a, b integer, intent(in) :: n real(kind=8) :: h integer :: i real(kind=8), dimension(n) :: xpoints, ypoints h=(b-a)/(n-1) ! print *, " " ! blank line ! print 12, h 12 format("h: ", es22.14) xpoints=(/((i*h+a),i=0,n-1)/) ypoints=(/(f(xpoints(i)),i=1,n)/) trapezoid = h*sum(ypoints)-0.5*h*(ypoints(1)+ypoints(n)) ! do i=1,5 ! print 13, xpoints(i), ypoints(i) !13 format("x,y = " es22.14, es22.14) ! enddo end function trapezoid subroutine error_table(f,a,b,nvals,int_true) implicit none real(kind=8), external :: f real(kind=8), intent(in) :: a, b, int_true integer, dimension(:), intent(in) :: nvals integer :: n real(kind=8) :: last_error, int_trap, error, ratio print *, " n trapezoid error ratio" last_error = 0.d0 do n=1,size(nvals) int_trap = trapezoid(f, a, b, nvals(n)) error = abs(int_trap - int_true) ratio = last_error / error last_error = error print 11, nvals(n), int_trap, error, ratio 11 format(i8, es22.14, es13.3, es13.3) enddo end subroutine error_table end module quadrature  • test1.f90 (cubic function - uses quadrature module) !$MYHPSC/homework4/test1.f90
! The purpose is to print a table for the effect of the number
! of integration points on the accuracy of the integration

! Example use:
! $gfortran quadrature.f90 test1.f90 !$ ./a.out

program test1

! To provide input values to print error_table to the screen

implicit none
real(kind=8) :: a,b,int_true
integer :: nvals(7), i, n

a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0

!    print 10, int_true
! 10 format("true integral: ", es22.14)
!    print *, " "  ! blank line

!	n=5

!	print 11, trapezoid(f,a,b,n)
!11  format("calculated integral: ", es22.14)
!   values of n to test:
do i=1,7
nvals(i) = 5 * 2**(i-1)
enddo

call error_table(f, a, b, nvals, int_true)

contains

! To return the the value of the function to integrate, f

real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x

f = 1.d0 + x**3
end function f

end program test1

• test2.f90 (sinusoidal function - uses quadrature module)

! $MYHPSC/homework4/test2.f90 ! The purpose is to print a table for the effect of the number ! of integration points on the accuracy of the integration ! Example use: !$ gfortran quadrature.f90 test2.f90
! $./a.out program test2 ! To provide input values to print error_table to the screen use quadrature, only: trapezoid, error_table implicit none real(kind=8) :: a,b,int_true,k integer :: nvals(12), i, n a = 0.d0 b = 2.d0 k = 1000.d0 int_true = (b-a) + (b**4 - a**4) / 4.d0 - & ((cos(k*b) - cos(k*a)) / k) ! values of n to test: do i=1,12 nvals(i) = 5 * 2**(i-1) enddo call error_table(f, a, b, nvals, int_true) contains ! To return the the value of the function to integrate, f real(kind=8) function f(x) implicit none real(kind=8), intent(in) :: x f = 1.d0 + x**3 + sin(1000*x) end function f end program test2  • quadrature_omp.f90, test2_omp.f90 (same as above but using OpenMP) module quadrature_omp ! Performs integration using quadrature integration ! and creates a table of the error between this and ! the known solution. contains real(kind=8) function trapezoid(f,a,b,n) use omp_lib implicit none real(kind=8), external :: f real(kind=8), intent(in) :: a, b integer, intent(in) :: n real(kind=8) :: h integer :: i real(kind=8), dimension(n) :: xpoints, ypoints h=(b-a)/(n-1) !$omp parallel private(i)

!$omp do reduction(+ : trapezoid) do i=1,n xpoints(i)=(i-1)*h+a ypoints(i)=f(xpoints(i)) enddo trapezoid = h*sum(ypoints)-0.5*h*(ypoints(1)+ypoints(n)) !$omp end parallel

end function trapezoid

subroutine error_table(f,a,b,nvals,int_true)
implicit none
real(kind=8), external :: f
real(kind=8), intent(in) :: a, b, int_true
integer, dimension(:), intent(in) :: nvals
integer :: n
real(kind=8) :: last_error, int_trap, error, ratio

print *, "      n  trapezoid               error        ratio"
last_error = 0.d0
do n=1,size(nvals)
int_trap = trapezoid(f, a, b, nvals(n))
error = abs(int_trap - int_true)
ratio = last_error / error
last_error = error
print 11, nvals(n), int_trap, error, ratio
11		format(i8, es22.14, es13.3, es13.3)
enddo

end subroutine error_table


! $MYHPSC/homework4/test2_omp.f90 ! The purpose is to print a table for the effect of the number ! of integration points on the accuracy of the integration ! Example use: !$ gfortran -fopenmp quadrature_omp.f90 test2_omp.f90
! $./a.out program test2_omp ! To provide input values to print error_table to the screen use omp_lib use quadrature_omp, only: trapezoid, error_table implicit none real(kind=8) :: a,b,int_true,k integer :: nvals(20), i, n, nthreads real(kind=8) :: t1, t2, elapsed_time integer(kind=8) :: tclock1, tclock2, clock_rate a = 0.d0 b = 2.d0 k = 1000.d0 int_true = (b-a) + (b**4 - a**4) / 4.d0 - & ((cos(k*b) - cos(k*a)) / k) ! Specify number of threads to use: nthreads = 1 ! need this value in serial mode !$ nthreads = 2
!$call omp_set_num_threads(nthreads) !$ print "('Using OpenMP with ',i3,' threads')", nthreads

! Specify number of threads to use:
!$call omp_set_num_threads(2) ! values of n to test: do i=1,20 nvals(i) = 5 * 2**(i-1) enddo call system_clock(tclock1) ! start wall timer call cpu_time(t1) ! start cpu timer call error_table(f, a, b, nvals, int_true) call cpu_time(t2) ! end cpu timer print 10, t2-t1 10 format("CPU time = ",es13.3, " seconds") call system_clock(tclock2, clock_rate) elapsed_time = float(tclock2 - tclock1) / float(clock_rate) print 11, elapsed_time 11 format("Elapsed time = ",es13.3, " seconds") contains ! To return the the value of the function to integrate, f real(kind=8) function f(x) implicit none real(kind=8), intent(in) :: x f = 1.d0 + x**3 + sin(1000*x) end function f end program test2_omp  ## 2.5. Workspace 5: Load Balancing¶ • Makefile (makefile for the 1D solution)  OBJECTS = functions.o quadrature.o test.o OBJECTS2 = functions.o quadrature2.o test2.o OBJECTS3 = functions.o quadrature3.o test3.o FFLAGS = -fopenmp .PHONY: test test2 test3 clean test: test.exe ./test.exe test2: test2.exe ./test2.exe test3: test3.exe ./test3.exe test.exe:$(OBJECTS)
gfortran $(FFLAGS)$(OBJECTS) -o test.exe

test2.exe: $(OBJECTS2) gfortran$(FFLAGS) $(OBJECTS2) -o test2.exe test3.exe:$(OBJECTS3)
gfortran $(FFLAGS)$(OBJECTS3) -o test3.exe

%.o : %.f90
gfortran $(FFLAGS) -c$<

clean:
rm -f *.o *.exe *.mod


• functions.f90 (sinusoial function module)

functions.f90

module functions

use omp_lib
implicit none
integer :: fevals(0:7)
real(kind=8) :: k
save

contains

real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x

! keep track of number of function evaluations by
thread_num = 0   ! serial mode
!$thread_num = omp_get_thread_num() fevals(thread_num) = fevals(thread_num) + 1 f = 1.d0 + x**3 + sin(k*x) end function f end module functions  • quadrature.f90, test.f90 (trapezoid rule module + test - serial) module quadrature use omp_lib contains real(kind=8) function trapezoid(f, a, b, n) ! Estimate the integral of f(x) from a to b using the ! Trapezoid Rule with n points. ! Input: ! f: the function to integrate ! a: left endpoint ! b: right endpoint ! n: number of points to use ! Returns: ! the estimate of the integral implicit none real(kind=8), intent(in) :: a,b real(kind=8), external :: f integer, intent(in) :: n ! Local variables: integer :: j real(kind=8) :: h, trap_sum, xj h = (b-a)/(n-1) trap_sum = 0.5d0*(f(a) + f(b)) ! endpoint contributions !$omp parallel do private(xj) reduction(+ : trap_sum)
do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(xj)
enddo

trapezoid = h * trap_sum

end function trapezoid

subroutine error_table(f,a,b,nvals,int_true,method)

! Compute and print out a table of errors when the quadrature
! rule specified by the input function method is applied for
! each value of n in the array nvals.

implicit none
real(kind=8), intent(in) :: a,b, int_true
real(kind=8), external :: f, method
integer, dimension(:), intent(in) :: nvals

! Local variables:
integer :: j, n
real(kind=8) :: ratio, last_error, error, int_approx

print *, "      n         approximation        error       ratio"
last_error = 0.d0
do j=1,size(nvals)
n = nvals(j)
int_approx = method(f,a,b,n)
error = abs(int_approx - int_true)
ratio = last_error / error
last_error = error  ! for next n

print 11, n, int_approx, error, ratio
11     format(i8, es22.14, es13.3, es13.3)
enddo

end subroutine error_table


program test

use omp_lib

use functions, only: f, fevals, k

implicit none
real(kind=8) :: a,b,int_true
integer :: nvals(12)

real(kind=8) :: t1, t2, elapsed_time
integer(kind=8) :: tclock1, tclock2, clock_rate

nthreads = 1      ! for serial mode
!$nthreads = 4 ! for openmp !$ call omp_set_num_threads(nthreads)

fevals = 0

k = 1.d3   ! functions module variable for function f2
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))

print 10, int_true
10 format("true integral: ", es22.14)
print *, " "  ! blank line

! values of n to test:   (larger values than before)
do i=1,12
nvals(i) = 50 * 2**(i-1)
enddo

! time the call to error_table:
call system_clock(tclock1)
call cpu_time(t1)
call error_table(f, a, b, nvals, int_true, trapezoid)
call cpu_time(t2)
call system_clock(tclock2, clock_rate)

elapsed_time = float(tclock2 - tclock1) / float(clock_rate)
print *, " "
print 11, elapsed_time
11 format("Elapsed time = ",f12.8, " seconds")

print 12, t2-t1
12 format("CPU time = ",f12.8, " seconds")

! print the number of function evaluations by each thread:
print 101,  i, fevals(i)
101     format("fevals by thread ",i2,": ",i13)
enddo

print 102, sum(fevals)
102 format("Total number of fevals: ",i10)

end program test

• quadrature2.f90, test2.f90 (simpson’s rule module + test - serial)

module quadrature2

use omp_lib

contains

real(kind=8) function simpson(f, a, b, n)

! Estimate the integral of f(x) from a to b using the
! Simpson Rule with n points.

! Input:
!   f:  the function to integrate
!   a:  left endpoint
!   b:  right endpoint
!   n:  number of points to use
! Returns:
!   the estimate of the integral

implicit none
real(kind=8), intent(in) :: a,b
real(kind=8), external :: f
integer, intent(in) :: n

! Local variables:
integer :: j
real(kind=8) :: h, simp_sum, xj, xj2

h = (b-a)/(n-1)
simp_sum = (1.d0/6.d0)*(f(a) + f(b))  ! endpoint contributions
simp_sum = simp_sum + (2.d0/3.d0)*f(a+0.5d0*h) !for the half before the loop
!$omp parallel do private(xj,xj2) reduction(+ : simp_sum) do j=2,n-1 xj = a + (j-1.d0)*h xj2 = a + (j-0.5d0)*h simp_sum = simp_sum + (1.d0/3.d0)*f(xj) + (2.d0/3.d0)*f(xj2) enddo simpson = h * simp_sum end function simpson subroutine error_table(f,a,b,nvals,int_true,method) ! Compute and print out a table of errors when the quadrature ! rule specified by the input function method is applied for ! each value of n in the array nvals. implicit none real(kind=8), intent(in) :: a,b, int_true real(kind=8), external :: f, method integer, dimension(:), intent(in) :: nvals ! Local variables: integer :: j, n real(kind=8) :: ratio, last_error, error, int_approx print *, " n approximation error ratio" last_error = 0.d0 do j=1,size(nvals) n = nvals(j) int_approx = method(f,a,b,n) error = abs(int_approx - int_true) ratio = last_error / error last_error = error ! for next n print 11, n, int_approx, error, ratio 11 format(i8, es22.14, es13.3, es13.3) enddo end subroutine error_table end module quadrature2  program test2 use omp_lib use quadrature2, only: simpson, error_table use functions, only: f, fevals, k implicit none real(kind=8) :: a,b,int_true integer :: nvals(12) integer :: i, nthreads real(kind=8) :: t1, t2, elapsed_time integer(kind=8) :: tclock1, tclock2, clock_rate nthreads = 1 ! for serial mode !$ nthreads = 4   ! for openmp
!$call omp_set_num_threads(nthreads) print 100, nthreads 100 format("Using ",i2," threads") fevals = 0 k = 1.d3 ! functions module variable for function f2 a = 0.d0 b = 2.d0 int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a)) print 10, int_true 10 format("true integral: ", es22.14) print *, " " ! blank line ! values of n to test: (larger values than before) do i=1,12 nvals(i) = 50 * 2**(i-1) enddo ! time the call to error_table: call system_clock(tclock1) call cpu_time(t1) call error_table(f, a, b, nvals, int_true, simpson) call cpu_time(t2) call system_clock(tclock2, clock_rate) elapsed_time = float(tclock2 - tclock1) / float(clock_rate) print *, " " print 11, elapsed_time 11 format("Elapsed time = ",f12.8, " seconds") print 12, t2-t1 12 format("CPU time = ",f12.8, " seconds") ! print the number of function evaluations by each thread: do i=0,nthreads-1 print 101, i, fevals(i) 101 format("fevals by thread ",i2,": ",i13) enddo print 102, sum(fevals) 102 format("Total number of fevals: ",i10) end program test2  • quadrature3.f90, test3.f90 (trapezoid rule module + test - OpenMP) module quadrature3 use omp_lib contains real(kind=8) function trapezoid(f, a, b, n) ! Estimate the integral of f(x) from a to b using the ! Trapezoid Rule with n points. ! Input: ! f: the function to integrate ! a: left endpoint ! b: right endpoint ! n: number of points to use ! Returns: ! the estimate of the integral implicit none real(kind=8), intent(in) :: a,b real(kind=8), external :: f integer, intent(in) :: n ! Local variables: integer :: j real(kind=8) :: h, trap_sum, xj h = (b-a)/(n-1) trap_sum = 0.5d0*(f(a) + f(b)) ! endpoint contributions do j=2,n-1 xj = a + (j-1)*h trap_sum = trap_sum + f(xj) enddo trapezoid = h * trap_sum end function trapezoid subroutine error_table(f,a,b,nvals,int_true,method) ! Compute and print out a table of errors when the quadrature ! rule specified by the input function method is applied for ! each value of n in the array nvals. implicit none real(kind=8), intent(in) :: a,b, int_true real(kind=8), external :: f, method integer, dimension(:), intent(in) :: nvals ! Local variables: integer :: j, n real(kind=8) :: ratio, last_error, error, int_approx integer thread_num print *, " n approximation error ratio thread no." last_error = 0.d0 !$omp parallel do firstprivate(last_error) private(n, int_approx, error, ratio) &
!$omp schedule(dynamic) do j=size(nvals),1,-1 n = nvals(j) int_approx = method(f,a,b,n) error = abs(int_approx - int_true) ratio = last_error / error last_error = error ! for next n -- last_error is firstprivate variable - ! each thead must start with last_error as zero - so ratio will be wrong. thread_num = 0 ! serial mode !$ thread_num = omp_get_thread_num()
print 11, n, int_approx, error, ratio, thread_num
11     format(i8, es22.14, es13.3, es13.3, i8)
enddo

end subroutine error_table


program test3

use omp_lib

use functions, only: f, fevals, k

implicit none
real(kind=8) :: a,b,int_true
integer :: nvals(12)

real(kind=8) :: t1, t2, elapsed_time
integer(kind=8) :: tclock1, tclock2, clock_rate

nthreads = 1      ! for serial mode
!$nthreads = 4 ! for openmp !$ call omp_set_num_threads(nthreads)

fevals = 0

k = 1.d3   ! functions module variable for function f2
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))

print 10, int_true
10 format("true integral: ", es22.14)
print *, " "  ! blank line

! values of n to test:   (larger values than before)
do i=1,12
nvals(i) = 50 * 2**(i-1)
enddo

! time the call to error_table:
call system_clock(tclock1)
call cpu_time(t1)
call error_table(f, a, b, nvals, int_true, trapezoid)
call cpu_time(t2)
call system_clock(tclock2, clock_rate)

elapsed_time = float(tclock2 - tclock1) / float(clock_rate)
print *, " "
print 11, elapsed_time
11 format("Elapsed time = ",f12.8, " seconds")

print 12, t2-t1
12 format("CPU time = ",f12.8, " seconds")

! print the number of function evaluations by each thread:
print 101,  i, fevals(i)
101     format("fevals by thread ",i2,": ",i13)
enddo

print 102, sum(fevals)
102 format("Total number of fevals: ",i10)

end program test3


module functions

use omp_lib
implicit none
integer :: fevals(0:7), gevals(0:7)
real(kind=8) :: k
save

contains

real(kind=8) function f(g, x, c, d)
implicit none
real(kind=8), intent(in) :: c,d
real(kind=8), intent(in) :: x
real(kind=8), external :: g

real(kind=8) :: h, trap_sum, y

! keep track of number of function evaluations by
thread_num = 0   ! serial mode
!$thread_num = omp_get_thread_num() fevals(thread_num) = fevals(thread_num) + 1 n = 1000 h = (d-c)/(n-1) trap_sum = 0.5d0*(g(x,c) + g(x,d)) ! endpoint contributions ! !$omp parallel do private(y) reduction(+ : trap_sum)
do j=2,n-1
y = c + (j-1)*h
trap_sum = trap_sum + g(x,y)
enddo

f = h * trap_sum

end function f

real(kind=8) function g(x,y)
implicit none
real(kind=8), intent(in) :: x,y

!$thread_num_g = omp_get_thread_num() gevals(thread_num_g) = gevals(thread_num_g) + 1 g = sin(x+y) end function g end module functions  module quadrature use omp_lib contains real(kind=8) function trapezoid(f, g, a, b, c, d, n) ! Estimate the integral of f(x) from a to b using the ! Trapezoid Rule with n points. ! Input: ! f: the function to integrate ! a: left endpoint x ! b: right endpoint x ! c: left endpoint y ! d: right endpoint y ! n: number of points to use ! Returns: ! the estimate of the integral implicit none real(kind=8), intent(in) :: a,b,c,d real(kind=8), external :: f,g integer, intent(in) :: n ! Local variables: integer :: j real(kind=8) :: h, trap_sum, xj h = (b-a)/(n-1) trap_sum = 0.5d0*(f(g, a, c, d) + f(g, b, c, d)) ! endpoint contributions !$omp parallel do private(xj) reduction(+ : trap_sum)
do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(g, xj, c, d)
enddo

trapezoid = h * trap_sum

end function trapezoid

subroutine error_table(f,g,a,b,c,d,nvals,int_true,method)

! Compute and print out a table of errors when the quadrature
! rule specified by the input function method is applied for
! each value of n in the array nvals.

implicit none
real(kind=8), intent(in) :: a,b,c,d, int_true
real(kind=8), external :: f, g, method
integer, dimension(:), intent(in) :: nvals

! Local variables:
integer :: j, n
real(kind=8) :: ratio, last_error, error, int_approx

print *, "      n         approximation        error       ratio"
last_error = 0.d0
do j=1,size(nvals)
n = nvals(j)
int_approx = method(f,g,a,b,c,d,n)
error = abs(int_approx - int_true)
ratio = last_error / error
last_error = error  ! for next n

print 11, n, int_approx, error, ratio
11     format(i8, es22.14, es13.3, es13.3)
enddo

end subroutine error_table


program test

use omp_lib

use functions, only: f, g, fevals, gevals, k

implicit none
real(kind=8) :: a,b,c,d,int_true
integer :: nvals(10)

real(kind=8) :: t1, t2, elapsed_time
integer(kind=8) :: tclock1, tclock2, clock_rate

nthreads = 1      ! for serial mode
!$nthreads = 4 ! for openmp !$ call omp_set_num_threads(nthreads)

fevals = 0

k = 1.d3   ! functions module variable for function f2
a = 0.d0
b = 2.d0
c = 1.d0
d = 4.d0
!    int_true = (-sin(b+c)+sin(b+d))-(-sin(a+c)+sin(a+d))
int_true = (-sin(b+d)+sin(b+c))-(-sin(a+d)+sin(a+c))
print 10, int_true
10 format("true integral: ", es22.14)
print *, " "  ! blank line

! values of n to test:   (larger values than before)
do i=1,10
nvals(i) = 5 * 2**(i-1)
enddo

! time the call to error_table:
call system_clock(tclock1)
call cpu_time(t1)
call error_table(f, g, a, b, c, d, nvals, int_true, trapezoid)
call cpu_time(t2)
call system_clock(tclock2, clock_rate)

elapsed_time = float(tclock2 - tclock1) / float(clock_rate)
print *, " "
print 11, elapsed_time
11 format("Elapsed time = ",f12.8, " seconds")

print 12, t2-t1
12 format("CPU time = ",f12.8, " seconds")

! print the number of function evaluations by each thread:
print 101,  i, fevals(i)
101     format("fevals by thread ",i2,": ",i13)
enddo

print 102, sum(fevals)
102 format("Total number of fevals: ",i10)

! print the number of function evaluations by each thread:
print 103,  i, gevals(i)
103     format("gevals by thread ",i2,": ",i13)
enddo

print 104, sum(gevals)
104 format("Total number of gevals: ",i10)

end program test


## 2.6. Workspace 6: MPI¶

• Makefile (makefile for this directory)

# $MYHPSC/codes/mpi/Makefile # Usage: to try out test1.f90, for example: # make test FNAME=test1 # or # make test FNAME=test1 NUM_PROCS=8 NUM_PROCS ?= 4 # default if not specified on command line or env variable FC = mpif90 FFLAGS = LFLAGS = .PHONY: clean, test FNAME ?= test %.o : %.f90$(FC) $(FFLAGS) -c$<

$(FNAME).exe:$(FNAME).f90
$(FC)$(FFLAGS) $(LFLAGS)$(FNAME).f90 -o $(FNAME).exe test:$(FNAME).exe
mpiexec -n $(NUM_PROCS) ./$(FNAME).exe

clean:
rm -f *.o *.exe

• copyvalue.f90 (passes a value from Process 0 to Processes 1-3 using MPI)

! $UWHPSC/codes/mpi/copyvalue.f90 ! ! Set value in Process 0 and copy this through a chain of processes ! and finally print result from Process numprocs-1. ! program copyvalue use mpi implicit none integer :: i, proc_num, num_procs,ierr integer, dimension(MPI_STATUS_SIZE) :: status call MPI_INIT(ierr) call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr) call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr) if (num_procs==1) then print *, "Only one process, cannot do anything" call MPI_FINALIZE(ierr) stop endif if (proc_num==0) then i = 55 print '("Process ",i3," setting i = ",i3)', proc_num, i call MPI_SEND(i, 1, MPI_INTEGER, 1, 21, & MPI_COMM_WORLD, ierr) else if (proc_num < num_procs - 1) then call MPI_RECV(i, 1, MPI_INTEGER, proc_num-1, 21, & MPI_COMM_WORLD, status, ierr) print '("Process ",i3," receives i = ",i3)', proc_num, i print '("Process ",i3," sends i = ",i3)', proc_num, i call MPI_SEND(i, 1, MPI_INTEGER, proc_num+1, 21, & MPI_COMM_WORLD, ierr) else if (proc_num == num_procs - 1) then call MPI_RECV(i, 1, MPI_INTEGER, proc_num-1, 21, & MPI_COMM_WORLD, status, ierr) print '("Process ",i3," ends up with i = ",i3)', proc_num, i endif call MPI_FINALIZE(ierr) end program copyvalue  • copyvalue2.f90 (passes a value from Process 3 to Processes 0-2 using MPI) !$UWHPSC/codes/mpi/copyvalue2.f90
!
! Set value in Process numprocs-1 and copy this through a chain of processes
! and finally print result from Process 0.
!

program copyvalue2

use mpi

implicit none

integer :: i, proc_num, num_procs,ierr
integer, dimension(MPI_STATUS_SIZE) :: status

call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)

if (num_procs .EQ. 1) then
print *, "Only one process, cannot do anything"
call MPI_FINALIZE(ierr)
stop
endif

if (proc_num .EQ. num_procs-1) then
i = 55
print '("Process ",i3," setting      i = ",i3)', proc_num, i

call MPI_SEND(i, 1, MPI_INTEGER, num_procs-2, 21, &
MPI_COMM_WORLD, ierr)

else if (proc_num .GT. 0) then

call MPI_RECV(i, 1, MPI_INTEGER, proc_num+1, 21, &
MPI_COMM_WORLD, status, ierr)

print '("Process ",i3," receives     i = ",i3)', proc_num, i
print '("Process ",i3," sends        i = ",i3)', proc_num, i-1

call MPI_SEND(i-1, 1, MPI_INTEGER, proc_num-1, 21, &
MPI_COMM_WORLD, ierr)

else if (proc_num .EQ. 0) then

call MPI_RECV(i, 1, MPI_INTEGER, 1, 21, &
MPI_COMM_WORLD, status, ierr)

print '("Process ",i3," ends up with i = ",i3)', proc_num, i
endif

call MPI_FINALIZE(ierr)

end program copyvalue2


module functions

implicit none
integer :: fevals_proc
real(kind=8) :: k
save

contains

real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x
integer proc_num, ierr

! keep track of number of function evaluations by
! each process:
fevals_proc = fevals_proc + 1

f = 1.d0 + x**3 + sin(k*x)

end function f

end module functions

module quadrature

contains

real(kind=8) function trapezoid(f, a, b, n)

! Estimate the integral of f(x) from a to b using the
! Trapezoid Rule with n points.

! Input:
!   f:  the function to integrate
!   a:  left endpoint
!   b:  right endpoint
!   n:  number of points to use
! Returns:
!   the estimate of the integral

implicit none
real(kind=8), intent(in) :: a,b
real(kind=8), external :: f
integer, intent(in) :: n

! Local variables:
integer :: j
real(kind=8) :: h, trap_sum, xj

h = (b-a)/(n-1)
trap_sum = 0.5d0*(f(a) + f(b))  ! endpoint contributions

do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(xj)
enddo

trapezoid = h * trap_sum

end function trapezoid



FFLAGS =
NUM_PROCS ?= 4   # default if not specified on command line or env variable

.PHONY: test clean test2

test: test.exe
mpiexec -n $(NUM_PROCS) test.exe test2: test2.exe mpiexec -n$(NUM_PROCS) test2.exe

test.exe: $(OBJECTS) mpif90$(FFLAGS) $(OBJECTS) -o test.exe test2.exe:$(OBJECTS2)
mpif90 $(FFLAGS)$(OBJECTS2) -o test2.exe

%.o : %.f90
mpif90 $(FFLAGS) -c$<

clean:
rm -f *.o *.exe *.mod


• part2/test.f90 (compute the same integral over all Processes using MPI)

program test

use mpi

use functions, only: f, fevals_proc, k

implicit none
real(kind=8) :: a,b,int_true, int_approx

integer :: proc_num, num_procs, ierr, n, fevals_total
integer, dimension(MPI_STATUS_SIZE) :: status

call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)

! All processes set these values so we don't have to broadcast:
k = 1.d3   ! functions module variable
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
n = 1000

! Each process keeps track of number of fevals:
fevals_proc = 0

if (proc_num==0) then
print '("Using ",i3," processes")', num_procs
print '("true integral: ", es22.14)', int_true
print *, " "  ! blank line
endif

call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for process 0 to print

! Note: In this version all processes call trap and repeat the
!       same work, so each should get the same answer.
int_approx = trapezoid(f,a,b,n)
print '("Process ",i3," with n = ",i8," computes int_approx = ",es22.14)', &
proc_num,n, int_approx

call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for all process to print

! print the number of function evaluations by each thread:
print '("fevals by Process ",i2,": ",i13)',  proc_num, fevals_proc

call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for all process to print

call MPI_REDUCE(fevals_proc, fevals_total, 1, &
MPI_DOUBLE_PRECISION, MPI_SUM, 0, &
MPI_COMM_WORLD, ierr)

if (proc_num==0) then
! This is wrong -- part of homework is to fix this:

!    fevals_total = 0   !! need to fix
print '("Total number of fevals:  ",i10)', fevals_total
endif

call MPI_FINALIZE(ierr)

end program test

• part2/test2.f90 (integral using Master-Worker paradigm)Makefile for this directory

program test2

use mpi

use functions, only: f, fevals_proc, k

implicit none
real(kind=8) :: a, b, int_true, int_approx, dx_sub, int_sub

integer :: proc_num, num_procs, ierr, n, fevals_total, n_sub, j, jj
integer, dimension(MPI_STATUS_SIZE) :: status
real(kind=8), allocatable, dimension(:,:) :: ab_sub
real(kind=8), allocatable, dimension(:) :: sub, int_array

call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)

! All processes set these values so we don't have to broadcast:
k = 1.d3   ! functions module variable
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
n = 1000

n_sub = num_procs-1 ! Set the number of Worker Processes

! Each process keeps track of number of fevals:
fevals_proc = 0

if (proc_num .GT. 0) then
allocate(sub(2))   ! to hold a column vector sent from master
endif

if (proc_num .EQ. 0) then
print '("Using ",i3," processes")', num_procs
print '("true integral: ", es22.14)', int_true
print *, " "  ! blank line

allocate(ab_sub(2,n_sub))    ! to hold a-b values for Workers
allocate(int_array(n_sub))	 ! to hold integral from Workers

ab_sub = 1.d0

! Process 0 (Master) sends other processes (Workers) left and right edges
! of the integral

dx_sub = (b-a) / n_sub

do j=1,n_sub
ab_sub(1,j) = a + (j-1)*dx_sub
ab_sub(2,j) = a + j*dx_sub
call MPI_SEND(ab_sub(1,j), 2, MPI_DOUBLE_PRECISION, j, j, &
MPI_COMM_WORLD, ierr)
enddo

! Master recieves part of the integral from the Workers and stores it in
! an array from any process

do j=1,n_sub
call MPI_RECV(int_sub, 1, MPI_DOUBLE_PRECISION, &
MPI_ANY_SOURCE, MPI_ANY_TAG, &
MPI_COMM_WORLD, status, ierr)
jj = status(MPI_TAG)
int_array(jj) = int_sub
enddo

int_approx = sum(abs(int_array))

endif

! Process 0 (Master) sends other processes (Workers) left and right edges
! of the integral

if(proc_num .GT. 0) then

call MPI_RECV(sub, 2, MPI_DOUBLE_PRECISION, &
0, MPI_ANY_TAG, &
MPI_COMM_WORLD, status, ierr)

int_sub = trapezoid(f,sub(1),sub(2),n)

j = status(MPI_TAG)   ! this is the row number
! (should agree with proc_num)

call MPI_SEND(int_sub, 1, MPI_DOUBLE_PRECISION, &
0, j, MPI_COMM_WORLD, ierr)

! print the number of function evaluations by each thread:
print '("fevals by Process ",i2,": ",i13)', proc_num, fevals_proc

endif

! Master Process prints out the result:
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for all processes to print

if (proc_num .EQ. 0) then
print '("Trapezoid approximation with ",i8," total points: ",es22.14)',&
n_sub*n, int_approx
endif

call MPI_FINALIZE(ierr)

end program test2


## 2.7. Project 1: 1D Integral and MPI¶

• functions.f90, quadrature.f90, test2.f90, Makefile (compute a 1D integral using MPI by dividing up intervals between an indeterminate number of processes)


FFLAGS =
NUM_PROCS ?= 4   # default if not specified on command line or env variable

.PHONY: test clean test2 test3

test: test.exe
mpiexec -n $(NUM_PROCS) test.exe test2: test2.exe mpiexec -n$(NUM_PROCS) test2.exe

test3: test3.exe
mpiexec -n $(NUM_PROCS) test3.exe test.exe:$(OBJECTS)
mpif90 $(FFLAGS)$(OBJECTS) -o test.exe

test2.exe: $(OBJECTS2) mpif90$(FFLAGS) $(OBJECTS2) -o test2.exe test3.exe:$(OBJECTS3)
mpif90 $(FFLAGS)$(OBJECTS3) -o test3.exe

%.o : %.f90
mpif90 $(FFLAGS) -c$<

clean:
rm -f *.o *.exe *.mod


module functions

implicit none
integer :: fevals_proc
real(kind=8) :: k
save

contains

real(kind=8) function f(x)
implicit none
real(kind=8), intent(in) :: x
integer proc_num, ierr

! keep track of number of function evaluations by
! each process:
fevals_proc = fevals_proc + 1

f = 1.d0 + x**3 + sin(k*x)

end function f

end module functions

module quadrature

contains

real(kind=8) function trapezoid(f, a, b, n)

! Estimate the integral of f(x) from a to b using the
! Trapezoid Rule with n points.

! Input:
!   f:  the function to integrate
!   a:  left endpoint
!   b:  right endpoint
!   n:  number of points to use
! Returns:
!   the estimate of the integral

implicit none
real(kind=8), intent(in) :: a,b
real(kind=8), external :: f
integer, intent(in) :: n

! Local variables:
integer :: j
real(kind=8) :: h, trap_sum, xj

h = (b-a)/(n-1)
trap_sum = 0.5d0*(f(a) + f(b))  ! endpoint contributions

do j=2,n-1
xj = a + (j-1)*h
trap_sum = trap_sum + f(xj)
enddo

trapezoid = h * trap_sum

end function trapezoid


program test2

use mpi

use functions, only: f, fevals_proc, k

implicit none
real(kind=8) :: a, b, int_true, int_approx, dx_sub, int_sub

integer :: proc_num, num_procs, ierr, n, fevals_total, n_sub, j, jj
integer, dimension(MPI_STATUS_SIZE) :: status
real(kind=8), allocatable, dimension(:,:) :: ab_sub
real(kind=8), allocatable, dimension(:) :: sub, int_array

call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_procs, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, proc_num, ierr)

! All processes set these values so we don't have to broadcast:
k = 1.d3   ! functions module variable
a = 0.d0
b = 2.d0
int_true = (b-a) + (b**4 - a**4) / 4.d0 - (1.d0/k) * (cos(k*b) - cos(k*a))
n = 1000

n_sub = num_procs-1 ! Set the number of Worker Processes

! Each process keeps track of number of fevals:
fevals_proc = 0

if (proc_num .GT. 0) then
allocate(sub(2))   ! to hold a column vector sent from master
endif

if (proc_num .EQ. 0) then
print '("Using ",i3," processes")', num_procs
print '("true integral: ", es22.14)', int_true
print *, " "  ! blank line

allocate(ab_sub(2,n_sub))    ! to hold a-b values for Workers
allocate(int_array(n_sub))	 ! to hold integral from Workers

ab_sub = 1.d0

! Process 0 (Master) sends other processes (Workers) left and right edges
! of the integral

dx_sub = (b-a) / n_sub

do j=1,n_sub
ab_sub(1,j) = a + (j-1)*dx_sub
ab_sub(2,j) = a + j*dx_sub
call MPI_SEND(ab_sub(1,j), 2, MPI_DOUBLE_PRECISION, j, j, &
MPI_COMM_WORLD, ierr)
enddo

! Master recieves part of the integral from the Workers and stores it in
! an array from any process

do j=1,n_sub
call MPI_RECV(int_sub, 1, MPI_DOUBLE_PRECISION, &
MPI_ANY_SOURCE, MPI_ANY_TAG, &
MPI_COMM_WORLD, status, ierr)
jj = status(MPI_TAG)
int_array(jj) = int_sub
enddo

int_approx = sum(abs(int_array))

endif

! Process 0 (Master) sends other processes (Workers) left and right edges
! of the integral

if(proc_num .GT. 0) then

call MPI_RECV(sub, 2, MPI_DOUBLE_PRECISION, &
0, MPI_ANY_TAG, &
MPI_COMM_WORLD, status, ierr)

int_sub = trapezoid(f,sub(1),sub(2),n)

j = status(MPI_TAG)   ! this is the row number
! (should agree with proc_num)

call MPI_SEND(int_sub, 1, MPI_DOUBLE_PRECISION, &
0, j, MPI_COMM_WORLD, ierr)

! print the number of function evaluations by each thread:
print '("fevals by Process ",i2,": ",i13)', proc_num, fevals_proc

endif

! Master Process prints out the result:
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for all processes to print

if (proc_num .EQ. 0) then
print '("Trapezoid approximation with ",i8," total points: ",es22.14)',&
n_sub*n, int_approx
endif

call MPI_FINALIZE(ierr)

end program test2


## 2.8. Project 2: 20D Integral and the Monte Carlo Method¶

• functions.f90, quadrature_mc.f90, random_util.f90, test_quad_mc.f90, Makefile (compute integral using Monte Carlo method for a 20 dimensional integral - serial)

OBJECTS = random_util.o functions.o quadrature_mc.o test_quad_mc.o
FFLAGS =

.PHONY: test plot clean clobber

test: test.exe
./test.exe

test.exe: $(OBJECTS) gfortran$(FFLAGS) $(OBJECTS) -o test.exe %.o : %.f90 gfortran$(FFLAGS) -c  $< mc_quad_error.txt: test.exe ./test.exe plot: mc_quad_error.txt python plot_mc_quad_error.py clean: rm -f *.o *.exe *.mod clobber: clean rm -f mc_quad_error.txt mc_quad_error.png  ! --- ! This module returns the value of a function: ! sum(x**2) over all dimensions ! it also sets the number of evaulations of this function ! --- module functions ! --- ! gevals is a module variable ! Example use: use functions, only gevals ! --- implicit none integer :: gevals save contains ! --- ! Multiple dimension function ! Example use: f = g(xin,ndim) ! Inputs: x (vector of length ndim), ndim (number of dimensions) ! Output: g (the value of the function over all dimensions) ! --- function g(x,ndim) ! --- ! Declare parameters and vectors ! --- implicit none integer, intent(in) :: ndim real(kind=8), intent(in) :: x(ndim) real(kind=8) :: g integer :: i ! --- ! We need to initialise the number of evaluations to zero ! --- g = 0.d0 ! --- ! Evaluate function in 20 dimensions ! --- do i=1,ndim g = g + x(i)**2 enddo ! --- ! Update the number of function evaluations ! --- gevals = gevals + 1 end function g end module functions  ! --- ! This module performs Monte Carlo Integration by: ! 1) Generating random input over a number of points ! 2) Evaluating the function at these points for 20 dimensions ! 3) Summing the function and dividing by the number of points for the average ! 4) Multiplying the average by the base for the integral ! --- module quadrature_mc ! --- ! Uses random seed and functions modules ! --- use random_util, only: init_random_seed use functions, only: g contains ! --- ! Monte Carlo Integration function ! Example use: int_mc = quad_mc(g,a,b,ndim,npoints) ! Inputs: g (from functions module), a and b (vectors of length ndim) ! ndim (number of dimensions) npoints (number of points) ! Output: quad_mc (the value of the integral) ! --- function quad_mc(g, a, b, ndim, npoints) ! --- ! Declare parameters and arrays. ! --- implicit none real(kind=8), external :: g real(kind=8), dimension(ndim), intent(in) :: a, b integer, intent(in) :: ndim, npoints integer :: seed1, i, j real(kind=8), dimension(:), allocatable :: f, xin, x real(kind=8) :: quad_mc ! --- ! Number of points & dimensions is unknown, so must allocate: ! --- allocate(x(npoints)) ! random number array 0 to 1 allocate(xin(ndim)) ! random number array a+(b-a)x allocate(f(npoints)) ! result of sum(x**2) over all dimensions ! for n points ! --- ! Generate random number array x once per call of quad_mc ! Length of the array is the number of random points ! --- seed1 = 12345 ! seed1 = 12345 for repeatable seed ! seed1 = 0 for random seed call init_random_seed(seed1) call random_number(x) ! --- ! Generate random input based on a and b and ! compute sum(x**2) for all dimensions ! --- do i=1,npoints do j=1,ndim xin(j)=a(j)+(b(j)-a(j))*x(i) enddo f(i) = g(xin,ndim) enddo ! --- ! Return the value of the Monte Carlo Integral by multiplying the ! average value by the base ! --- quad_mc = product(b-a)*(sum(f)/npoints) end function quad_mc end module quadrature_mc  ! --- ! This module returns seed for the random_number function ! --- module random_util contains subroutine init_random_seed(seed1) ! --- ! Declare the parameters ! --- integer, intent(inout) :: seed1 integer :: clock integer, dimension(:), allocatable :: seed ! --- ! Determine how many numbers needed to seed and allocate ! --- call random_seed(size = nseed) allocate(seed(nseed)) ! --- ! If seed1 = 0 then set seed1 randomly using the system clock. ! This will give different sequences of random numbers ! from different runs, so results are not reproducible. ! --- if (seed1 == 0) then call system_clock(count = clock) seed1 = clock endif ! --- ! If seed1 > 0 then results will be reproducible since calling this ! twice with the same seed will initialize the random ! number generator so the same sequence is generated. ! Once seed1 is set, set the other elements of the seed array by adding ! multiples of 37 as suggested in the documentation. ! --- do i=1,nseed seed(i) = seed1 + 37*(i-1) enddo ! --- ! Seed the generator ! --- call random_seed(put = seed) ! --- ! Deallocate the seed ! --- deallocate(seed) end subroutine init_random_seed end module random_util  ! --- ! This program computes the Monte Carlo Integral by doubling the number ! of points every time ! --- program test_quad_mc ! --- ! Uses functions, quadrature_mc and random_util modules ! --- use functions, only: g, gevals use quadrature_mc, only: quad_mc use random_util, only: init_random_seed ! --- ! Declare the parameters ! --- implicit none integer, parameter :: ndim = 20 real(kind=8), dimension(ndim) :: a, b real(kind=8) :: volume, int_true, int_mc, int_mc_total, error integer :: i, seed1, n_total, npoints ! --- ! Need to initialize number of g evaulations to zero, because it's a module variable ! --- gevals = 0 ! --- ! Open a file to write to ! --- open(unit=25, file='mc_quad_error.txt', status='unknown') ! --- ! a and b are the vectors for the integral limits in all dimensions ! and are the same in all directions ! --- do i=1,ndim a(i) = 2.d0 b(i) = 4.d0 enddo ! --- ! Compute the true integral for special case where ! g(x) = sum(x**2) over all dimensions ! True integral = base * height, ! where height is the sum of the average value of the true integral in all dimensions ! --- volume = product(b-a) ! = product of b(i)-a(i) of ndim dimensions int_true = (volume) * sum((b**3 - a**3) / (3.d0*(b-a))) print '("Testing Monte Carlo quadrature in ",i2," dimensions")', ndim print '("True integral: ", es22.14)', int_true ! --- ! Start with Monte Carlo using only a few points. ! --- npoints = 10 ! --- ! Loop to successively and double the number of points used: ! --- do i=1,17 ! --- ! Compute integral using Monte Carlo method for a given set of points ! compute relative error and write these to file with number of points ! --- int_mc = quad_mc(g,a,b,ndim,npoints) error = abs(int_mc - int_true) / abs(int_true) write(25,'(i10,e23.15,e15.6)') npoints, int_mc, error ! --- ! Double the number of points ! --- npoints = 2*npoints enddo print '("Final approximation to integral: ",es22.14)',int_mc print *, "Total g evaluations: ",gevals end program test_quad_mc  from pylab import * # read in three columns from file and unpack into 3 arrays: n,int_approx,error = loadtxt('mc_quad_error.txt',unpack=True) figure(1) clf() loglog(n,error,'-o',label='Monte-Carlo') loglog([1,1e7],[1,sqrt(1e-7)],'k',label='1 / sqrt(N)') legend() xlabel('number of MC points used') ylabel('abs(error)') title('Log-log plot of relative error in MC quadrature') savefig('mc_quad_error.png')  ## 2.9. Project 3: Laplace’s Equation and the Monte Carlo Method¶ • laplace_mc.py (python version) • problem_description.f90, laplace_mc.f90, mc_walk.f90, random_util.f90, Makefile (compute the solution to Laplace’s Equation using the Monte Carlo Method - serial) • test1.f90, test2.f90 (trials I made along the way) • plot_mc_quad_error.py (plot the result) """ Random walk approximate solution to Laplace's equation u_{xx} + u{yy} = 0. Set demo==True to plot a few random walks interactively. With demo==False many more walks are used to estimate the solution. The boundary conditions are set for this test problem by evaluating the true solution u(x,y) = x^2 - y^2 of Laplace's equation on the boundary of the domain. Moreover, the exact solution to the discrete equations U_{i-1,j} + U_{i+1,j} + U_{i,j-1} + U_{i,j+1} - 4u_{ij} = 0 with boundary values obtained in this way is easily computed. It is simply given by evaluating the exact solution at the grid points, U_{ij} = x_i^2 - y_j^2 This is because the centered difference approximation to the second derivative is exact when applied to a quadratic function. This code implements a random walk on a lattice (rectangular grid) where in each step the walk goes in one of 4 directions based on the value of a random number that's uniformly distributed in [0,1]. """ from pylab import * from numpy.random import RandomState import time, sys # Set plot_walk to: # True ==> 10 walks will be done, plotting each for illustration. # False ==> Many more walks taken to investigate convergence. plot_walk = False # problem description: ax = 0. bx = 1. ay = 0.4 by = 1. nx = 19 ny = 11 dx = (bx-ax)/float(nx+1) dy = (by-ay)/float(ny+1) debug = False # to turn on some additional printing def utrue(x,y): """ Return true solution for a test case where this is known. This function should satisfy u_{xx} + u_{yy} = 0. """ utrue = x**2 - y**2 return utrue def uboundary(x,y): """ Return u(x,y) if (x,y) is on the boundary. """ if (x-ax)*(x-bx)*(y-ay)*(y-by) != 0: print "*** Not on the boundary" raise Exception("*** uboundary called at non-boundary point") return nan # assume we know the true solution and can just evaluate there: ub = utrue(x,y) return ub def plot_initialize(i0,j0): """ Set up plot for demo. """ figure(1,figsize=(7,8)) clf() axes([.1,.3,.8,.6]) # leave room for printing messages x = linspace(ax,bx,nx+2) y = linspace(ay,by,ny+2) X,Y = meshgrid(x,y) # turn into 2d arrays plot(X,Y,'k') plot(X.T,Y.T,'k') plot([ax+i0*dx],[ay+j0*dy],'ro') axis('scaled') title("Initial point for random walk") draw() show() time.sleep(1) # pause to see the initial location def plot_ub(xb,yb,ub): """ Called when we hit the boundary """ plot([xb], [yb], 'ro') text(ax, ay-0.2, 'Hit boundary where u = %9.6f' % ub, fontsize=20) draw() time.sleep(2) def plot_step(iold, jold, i, j, istep): """ Called each step of the random walk to plot progress """ # plot next segment of walk and new point in red: plot([ax+iold*dx, ax+i*dx], [ay+jold*dy, ay+j*dy], 'r',linewidth=2) plot([ax+i*dx], [ay+j*dy], 'ro') title("After %6i random steps" % (istep+1)) draw() time.sleep(0.05) # redraw last segment and point in blue: plot([ax+iold*dx, ax+i*dx], [ay+jold*dy, ay+j*dy], 'b',linewidth=2) plot([ax+i*dx], [ay+j*dy], 'bo') draw() def random_walk(i0, j0, max_steps): """ Take one random walk starting at (i0,j0) until we reach the boundary or exceed max_steps steps. Return the value at the boundary point reached, or nan if we failed. """ # starting point: i = i0 j = j0 if plot_walk: plot_initialize(i,j) # generate as many random numbers as we could possibly need # for this walk, since this is much faster than generating one at a time: r = random_generator.uniform(0., 1., size=max_steps) if debug: print "+++ generated r: ", r for istep in range(max_steps): iold,jold = i,j # needed for plotting only # Take the next random step with equal probability in each # direction: if r[istep] < 0.25: i = i-1 # step left elif r[istep] < 0.5: i = i+1 # step right elif r[istep] < 0.75: j = j-1 # step down else: j = j+1 # step up if plot_walk: plot_step(iold,jold, i,j, istep) # check if we hit the boundary: if i*j*(nx+1-i)*(ny+1-j) == 0: xb = ax + i*dx yb = ay + j*dy ub = uboundary(xb, yb) if plot_walk: plot_ub(xb,yb,ub) if debug: print "Hit boundary at (%7.3f, %7.3f) after %i7 steps, ub = %7.3f" \ % (xb,yb,istep+1,ub) break # end the walk if istep==(max_steps-1): if debug: print "Did not hit boundary after %i steps" % max_steps if plot_walk: text(0, -0.2, "Did not hit boundary after %i steps" \ % max_steps, fontsize=20) draw() time.sleep(2) ub = nan return ub def many_walks(i0, j0, max_steps, n_mc): ub_sum = 0. # to accumulate boundary values reached from all walks n_success = 0 # to keep track of how many walks reached boundary for k in range(n_mc): i = i0 j = j0 ub = random_walk(i0, j0, max_steps) if not isnan(ub): # use this result unless walk didn't reach boundary ub_sum += ub n_success += 1 u_mc = ub_sum / n_success # average over successful walks return u_mc, n_success if __name__ == "__main__": # MAIN PROGRAM # Try it out from a specific (x0,y0): x0 = 0.9 y0 = 0.6 i0 = round((x0-ax)/dx) j0 = round((y0-ay)/dy) # shift (x0,y0) to a grid point if it wasn't already: x0 = ax + i0*dx y0 = ay + j0*dy u_true = utrue(x0,y0) print "True solution of PDE: u(%7.3f, %7.3f) = %9.5f" % (x0,y0,u_true) print "Note: with solution used in demo this is also the solution to the" print " the finite-difference equations on the same grid." ans = raw_input("Enter seed for random generator or <return> .... ") if ans == "": seed = None # will cause a random seed to be used else: seed = int(ans) # convert string returned from raw_input into integer random_generator = RandomState(seed) # maximum number of step in each before giving up: max_steps = 100*max(nx, ny) # initial number of Monte-Carlo walks to take: n_mc = 10 u_mc, n_success = many_walks(i0, j0, max_steps, n_mc) error = abs((u_mc - u_true) / u_true) print "After %8i random walks, u = %15.9f, rel. error = %15.6e" \ % (n_success, u_mc, error) if plot_walk: sys.exit() # quit after a few walks # start accumulating totals: u_mc_total = u_mc n_total = n_success for i in range(12): u_sum_old = u_mc_total * n_total u_mc, n_success = many_walks(i0, j0, max_steps, n_mc) u_sum_new = u_mc * n_success n_total = n_total + n_success u_mc_total = (u_sum_old + u_sum_new) / n_total error = abs((u_mc_total - u_true) / u_true) print "After %8i random walks, u = %15.9f, rel. error = %15.6e" \ % (n_total, u_mc_total, error) n_mc = 2*n_mc # double number of trials for next iteration  OBJECTS = random_util.o problem_description.o mc_walk.o test1.o OBJECTS2 = random_util.o problem_description.o mc_walk.o test2.o OBJECTS3 = random_util.o problem_description.o mc_walk.o laplace_mc.o FFLAGS = .PHONY: test test2 laplace test: test.exe ./test.exe test2: test2.exe ./test2.exe laplace: laplace.exe ./laplace.exe test.exe:$(OBJECTS)
gfortran $(FFLAGS)$(OBJECTS) -o test.exe

test2.exe: $(OBJECTS2) gfortran$(FFLAGS) $(OBJECTS2) -o test2.exe laplace.exe:$(OBJECTS3)
gfortran $(FFLAGS)$(OBJECTS3) -o laplace.exe

%.o : %.f90
gfortran $(FFLAGS) -c$<

clean:
rm -f *.o *.exe *.mod


module problem_description

implicit none
real(kind=8), parameter :: ax = 0.0d0
real(kind=8), parameter :: bx = 1.0d0
real(kind=8), parameter :: ay = 0.4d0
real(kind=8), parameter :: by = 1.0d0
integer, parameter :: nx = 19
integer, parameter :: ny = 11
real(kind=8), parameter :: dx = (bx - ax) / (nx+1)
real(kind=8), parameter :: dy = (by - ay) / (ny+1)
save

contains

function utrue(x, y)

! True solution for comparison, if known.

implicit none
real(kind=8), intent(in) :: x,y
real(kind=8) :: utrue

utrue = x**2 - y**2

end function utrue

function uboundary(x, y)

! Return u(x,y) assuming (x,y) is a boundary point.

implicit none
real(kind=8), intent(in) :: x,y
real(kind=8) :: uboundary

if ((x-ax)*(x-bx)*(y-ay)*(y-by) .ne. 0.d0) then
print *, "*** Error -- called uboundary at non-boundary point"
stop
endif

uboundary = utrue(x,y)   ! assuming we know this

end function uboundary

end module problem_description

program laplace_mc
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, many_walks, nwalks
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy

real(kind=8) :: u_mc, x0, y0, rel_error
integer :: n_mc, n_success, i0, j0, max_steps

! Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6

i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)

! shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy

u_true = utrue(x0,y0)
!nwalks = 0
n_mc = 10
max_steps = 100*max(nx, ny)
! do i=1,13
! call many_walks(i0, j0, max_steps, n_mc, u_mc, n_success)

! ---
! Open a file to write to
! ---
open(unit=25, file='laplace_mc.txt', status='unknown')
!rel_error = abs(u_true-u_mc)/u_true
print *, ""
print '(" True solution of PDE at x =",f10.4," y =",f10.4)', x0, y0
print '(" u_true =   ",es13.3)', u_true
print *, ""
print *, "    Walks    Monte Carlo Solution    Relative Error"
do i=1,13
nwalks=0
call many_walks(i0, j0, max_steps, n_mc, u_mc, n_success)
n_mc = 2.0d0 * n_mc
rel_error = abs(u_true-u_mc)/u_true
print '("  ",i8,"      ",es13.3,"        ",es13.3)', nwalks, u_mc, rel_error
write(25,'(i10,e23.15,e15.6)') nwalks, u_mc, rel_error
enddo
end program laplace_mc

! ---
! This has two functions random_walk and many_walks
! ---
module mc_walk
! ---
! Uses random seed module
! ---
use random_util, only: init_random_seed
use problem_description, only: uboundary, ax, ay, dx, dy, nx, ny
! ---
! nwalks is a module variable
! Example use: use mc_walk, only nwalks
! ---
implicit none
integer :: nwalks
!integer :: iabort
save

contains

! ---
! A single random walk
! Inputs: i (step in x), j (step in y), max_steps (maximum number of steps)
! Outputs: ub (value at boundary), iabort (0 for success, 1 for failure)
! Example use: ub, iabort = random_walk(0, 0, 100, ub, iabort)
!
!    i:  0,  1,.. nx, nx+1
! j:
! 0,   (BC) (BC) (BC) (BC)
! 1,   (BC)           (BC)
! ...  (BC)           (BC)
! nj   (BC)           (BC)
! nj+1 (BC) (BC) (BC) (BC)
! ---
subroutine random_walk(i0, j0, max_steps, ub, iabort)
! ---
! Declare parameters
! ---
implicit none
integer, intent(in) :: i0, j0, max_steps 	! i0 and j0 are inputs (static)
integer :: i, j 							! i and j are dummy variables
real(kind=8), intent(out) :: ub
integer, intent(out) :: iabort
real(kind=8), dimension(:), allocatable :: x
integer :: istep, seed1
real(kind=8) :: xb, yb
! ---
! Number of steps is unknown, so we must allocate:
! ---
allocate(x(max_steps)) ! random number array 0 to 1
! ---
! Generate random number array x (0 to 1)
! ---
seed1 = 0   ! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)
i = i0
j = j0
!nwalks = 0
! ---
! Go through the vector of random numbers and
! make moves depending on the value of x
! ---
do istep=1,max_steps
if (x(istep) .LT. 0.25) then
i = i-1    ! go left
! print *, "go left"
! print *, "i=", i
! print *, "istep=", istep
! print *, "x=", x(istep)
else if (x(istep) .LT. 0.5) then
i = i+1    ! go right
! print *, "go right"
! print *, "i=", i
! print *, "istep=", istep
! print *, "x=", x(istep)
else if (x(istep) .LT. 0.75) then
j = j-1    ! go down
! print *, "go up"
! print *, "j=", j
! print *, "istep=", istep
! print *, "x=", x(istep)
else
j = j+1    ! go up
! print *, "go down"
! print *, "j=", j
! print *, "istep=", istep
! print *, "x=", x(istep)
endif

! ---
! What if we are at the boundary?
! ---
if (i*j*(nx+1 - i)*(ny+1 - j) .EQ. 0) then
! print *, "we are at the boundary"
! print *, " "
xb = ax + i*dx 			! x is ax or ax+i*dx
yb = ay + j*dy			! y is ay or ay+j*dy
! print *, "xb=",xb
! print *, "yb=",yb
ub = uboundary(xb, yb)	! the boundary value is then computed
iabort = 0				! success
go to 99				! end do loop
endif
! ---
! What if we never reach the boundary within max steps?
! ---
if (istep .EQ. max_steps) then
! 	print *, "we aborted"
! 	print *, " "
ub = 0					! set ub to zero
iabort = 1				! failure
endif
enddo

99  continue

end subroutine random_walk

subroutine many_walks(i0, j0, max_steps, n_mc, u_mc, n_success)
! ---
! Declare parameters
! ---
implicit none
integer, intent(in) :: i0, j0, max_steps 		! i0 and j0 are inputs (static)
integer, intent(in) :: n_mc
integer :: i, j, k, iabort !, nwalks		! i and j are dummy variables
real(kind=8), intent(out) :: u_mc
integer, intent(out) :: n_success
real(kind=8) :: ub_sum, ub

ub_sum = 0
n_success = 0
!nwalks = 0
do k=1,n_mc
!nwalks = 0
i = i0
j = j0
call random_walk(i, j, max_steps, ub, iabort)
if(iabort .NE. 1) then
ub_sum = ub_sum + ub
n_success = n_success + 1
nwalks = nwalks +1
endif
u_mc = ub_sum/n_success

enddo

end subroutine many_walks

end module mc_walk

! ---
! This module returns seed for the random_number function
! ---
module random_util

contains

subroutine init_random_seed(seed1)
! ---
! Declare the parameters
! ---
integer, intent(inout) :: seed1
integer :: clock
integer, dimension(:), allocatable :: seed
! ---
! Determine how many numbers needed to seed and allocate
! ---
call random_seed(size = nseed)
allocate(seed(nseed))
! ---
! If seed1 = 0 then set seed1 randomly using the system clock.
! This will give different sequences of random numbers
! from different runs, so results are not reproducible.
! ---
if (seed1 == 0) then
call system_clock(count = clock)
seed1 = clock
endif
! ---
! If seed1 > 0 then results will be reproducible since calling this
! twice with the same seed will initialize the random
! number generator so the same sequence is generated.
! Once seed1 is set, set the other elements of the seed array by adding
! multiples of 37 as suggested in the documentation.
! ---
do i=1,nseed
seed(i) = seed1 + 37*(i-1)
enddo
! ---
! Seed the generator
! ---
call random_seed(put = seed)
! ---
! Deallocate the seed
! ---
deallocate(seed)

end subroutine init_random_seed

end module random_util

program test1
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, nwalks
real(kind=8) :: ub
integer :: iabort, i, j, max_steps
integer, parameter :: nx = 19
integer, parameter :: ny = 11
! range of i is 1 to nx
! range of j is 1 to nj

i = 1
j = 1
max_steps = 100*max(nx, ny)
call random_walk(i, j, max_steps, ub, iabort)
print *, "ub=", ub
print *, "iabort=", iabort
print *, "nwalks=", nwalks

end program test1

program test2
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, many_walks, nwalks
real(kind=8) :: u_mc
integer :: n_mc, n_success, i, j, max_steps
integer, parameter :: nx = 19
integer, parameter :: ny = 11
! range of i is 1 to nx
! range of j is 1 to nj

i = 1
j = 1
max_steps = 100*max(nx, ny)
call many_walks(i, j, max_steps, n_mc, u_mc, n_success)
print *, "u_mc=", u_mc
print *, "n_success=", n_success

end program test2

from pylab import *

# read in three columns from file and unpack into 3 arrays:

figure(1)
clf()
loglog(n,error,'-o',label='Monte-Carlo')
loglog([1,1e7],[1,sqrt(1e-7)],'k',label='1 / sqrt(N)')
legend()
xlabel('number of MC points used')
ylabel('abs(error)')
title('Log-log plot of relative error in MC quadrature')



## 2.10. Project 4: Laplace’s Equation, MPI and the Monte Carlo Method¶

• problem_description.f90, laplace_mc.f90, mc_walk.f90, random_util.f90, Makefile (compute the solution to Laplace’s Equation using the Monte Carlo Method - using MPI)

• test1.f90, test1b.f90, test2.f90 (trials I made along the way)

# $MYHPSC/project/part4/Makefile OBJECTS = random_util.o problem_description.o mc_walk.o laplace_mc.o OBJECTS2 = random_util.o problem_description.o mc_walk.o test1.o OBJECTS2b = random_util.o problem_description.o mc_walk.o test1b.o OBJECTS3 = random_util.o problem_description.o mc_walk.o test2.o NUMPROCS ?= 4 FLAGS ?= .PHONY: mpi laplace test1 test1b test2 mpi: testquad.exe mpiexec -n$(NUMPROCS) ./testquad.exe

test1: test1.exe
./test1.exe

test1b: test1b.exe
./test1b.exe

test2: test2.exe
./test2.exe

testquad.exe: $(OBJECTS) mpif90$(FLAGS) $(OBJECTS) -o testquad.exe test1.exe:$(OBJECTS2)
gfortran $(FFLAGS)$(OBJECTS2) -o test1.exe

test1b.exe: $(OBJECTS2b) gfortran$(FFLAGS) $(OBJECTS2b) -o test1b.exe test2.exe:$(OBJECTS3)
gfortran $(FFLAGS)$(OBJECTS3) -o test2.exe

%.o : %.f90
mpif90 -c $(FLAGS)$<

laplace: laplace.exe
./laplace.exe

laplace.exe: $(OBJECTS) gfortran$(FFLAGS) $(OBJECTS) -o laplace.exe #%.o : %.f90 # gfortran$(FFLAGS) -c  \$<

clean:
rm -f *.o *.exe *.mod

! ---
! This module returns the true value and the value at the boundary
! ---
module problem_description
! ---
! Initilize parameters
! ---
implicit none
real(kind=8), parameter :: ax = 0.0d0
real(kind=8), parameter :: bx = 1.0d0
real(kind=8), parameter :: ay = 0.4d0
real(kind=8), parameter :: by = 1.0d0
integer, parameter :: nx = 19
integer, parameter :: ny = 11
real(kind=8), parameter :: dx = (bx - ax) / (nx+1)
real(kind=8), parameter :: dy = (by - ay) / (ny+1)
save

contains
! ---
! Function for the true solution (if known)
! Inputs: x and y
! Output: z
! ---
function utrue(x, y)

implicit none
real(kind=8), intent(in) :: x,y
real(kind=8) :: utrue

utrue = x**2 - y**2

end function utrue
! ---
! Function for the value at the boundary
! Inputs: x and y
! Output: z
! ---
function uboundary(x, y)

implicit none
real(kind=8), intent(in) :: x,y
real(kind=8) :: uboundary
! ---
! If we are not at the boundary we print an error and stop
! ---
if ((x-ax)*(x-bx)*(y-ay)*(y-by) .ne. 0.d0) then
print *, "*** Error -- called uboundary at non-boundary point"
stop
endif
! ---
! If we are at the boundary we compute the solution
! ---
uboundary = utrue(x,y)   ! assuming we know this

end function uboundary

end module problem_description

! ---
! This program loops through the Monte Carlo solution of Laplace's equation,
! doubling the number of sample points every loop using MPI
! ---
program laplace_mc
! ---
! ALL PROCESSES: Modules and functions
! ---
use mpi
use random_util, only: init_random_seed
use mc_walk, only: random_walk, many_walks, n_walks, x, n_success_proc
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy
! ---
! ALL PROCESSES: Parameters, vectors and arrays
! ---
implicit none
real(kind=8) :: u_mc, x0, y0, rel_error, u_true
integer :: n_mc, n_success, i0, j0, max_steps, i, &
process_num, num_processes, ierr, &
istep, seed1, n_success_total
logical :: debug_many
! ---
! ALL PROCESSES: Number of steps is unknown, so we must allocate -
! 200000 is the maximum number of moves
! ---
n_mc = 10		! This must be at least 3 as there are 3 processes
! Use n_mc = 10 if not debugging random_walk
max_steps = 200000*max(nx, ny)
allocate(x(max_steps)) ! random number array 0 to 1
! ----
! ALL PROCESSES: Initialize MPI
! ----
call MPI_INIT(ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_processes, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, process_num, ierr)
! ---
! ALL PROCESSES: Generate random number array x (0 to 1)
! ---
seed1 = 12345   ! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
seed1 = seed1 + 97*process_num 	! unique for each process
print '(" Seed for process",i2, " is ",i5)', process_num, seed1
call MPI_BARRIER(MPI_COMM_WORLD,ierr) ! wait for processes to print
call init_random_seed(seed1)
call random_number(x)
! ----
! ALL PROCESSES: Try it out from a specific (x0,y0):
! ----
x0 = 0.9
y0 = 0.6
i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)
! ----
! ALL PROCESSES: Shift (x0,y0) to a grid point if it wasn't already:
! ----
x0 = ax + i0*dx
y0 = ay + j0*dy
u_true = utrue(x0,y0)
n_walks = 0
! ---
! MASTER PROCESS: Open a file to write to, provide true solution and print output headers
! ---
if (process_num .EQ. 0) then
open(unit=25, file='laplace_mc.txt', status='unknown')
print '(" True solution of PDE at x =",f10.4," y =",f10.4)', x0, y0
print '(" u_true =",e23.15)', u_true
print *, ""
print *, "  Successes    Monte Carlo Solution   Relative Error"
endif
! ---
! ALL PROCESSES: Loop through and call many_walks, doubling the number
! of Monte Carlo points by two each time
! ---
! MASTER PROCESS: Compute relative error and print output of many_walks
! ---
debug_many = .False.		! Best to remove the do loop if you're trying this
do i=1,13
call many_walks(i0, j0, max_steps, n_mc, u_mc, n_success, debug_many)
n_mc = 2.0d0 * n_mc
if (process_num .EQ. 0) then
rel_error = abs(u_true-u_mc)/u_true
print '("  ",i10," ",e23.15," ",e23.15)', n_success, u_mc, rel_error
write(25,'(i10,e23.15,e15.6)') n_success, u_mc, rel_error
endif
enddo
! ---
! ALL PROCESSES: Wait for process 0 to print
! ---
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
! ---
! MASTER PROCESS: Reduce n_success_proc to n_success_total
! ---
call MPI_REDUCE(n_success_proc, n_success_total, 1, &
MPI_DOUBLE_PRECISION, MPI_SUM, 0, MPI_COMM_WORLD, ierr)
! ---
! MASTER PROCESS: Reduce n_success_proc to n_success_total
! ---
if (process_num .EQ. 0) then
print '("Final approximation to u(x0,y0):",e23.15)', u_mc
print '("Total number of walks by all processes:",i10)', n_success_total
endif
! ---
! ALL PROCESSES: Wait for process 0 to print
! ---
call MPI_BARRIER(MPI_COMM_WORLD,ierr)
! ---
! ALL PROCESSES: Print out process number and number of successes
! ---
print '("Walks by process ",i1,":",i8)', process_num, n_success_proc
! ----
! ALL PROCESSES: COMPLETE MPI
! ----
call MPI_FINALIZE(ierr)

end program laplace_mc

! ---
! This module performs a random walk and also many random walks
! ---
module mc_walk
! ---
! ALL PROCESSES: Use random seed module & mpi
! ---
use mpi
use problem_description, only: uboundary, ax, ay, dx, dy, nx, ny
! ---
! ALL PROCESSES: nwalks & n_success_proc are module variables
! Example use: use mc_walk, only nwalks, n_success_proc
! ---
implicit none
integer :: n_walks, n_success_proc
real(kind=8), dimension(:), allocatable :: x
save

contains
! ---
! ALL PROCESSES: A single random walk
! Inputs: i0 (step in x), j0 (step in y), max_steps (maximum number of steps)
! Outputs: ub (value at boundary), iabort (0 for success, 1 for failure)
! Example use: random_walk(0, 0, 100, ub, iabort)
!
!    i:  0,  1,.. nx, nx+1
! j:
! 0,   (BC) (BC) (BC) (BC)
! 1,   (BC)           (BC)
! ...  (BC)           (BC)
! nj   (BC)           (BC)
! nj+1 (BC) (BC) (BC) (BC)
! ---
subroutine random_walk(i0, j0, max_steps, ub, iabort, debug_random)
! ---
! ALL PROCESSES: Declare parameters
! ---
implicit none
integer, intent(in) :: i0, j0, max_steps
logical, intent(in) :: debug_random
real(kind=8), intent(out) :: ub
integer, intent(out) :: iabort
integer :: i, j, istep, seed1, start
real(kind=8) :: xb, yb
! ---
! ALL PROCESSES: Initialize the walk from it's starting point
! ---
i = i0
j = j0
start = n_walks
! ---
! ALL PROCESSES: Go through the vector of random numbers and
! make moves depending on the value of x
! ---
do istep=start,max_steps
if (x(istep) .LT. 0.25) then
i = i-1    ! go left
if (debug_random) then
print *, "go left"
print *, "i=", i
print *, "istep=", istep
print *, "x=", x(istep)
endif
n_walks = n_walks + 1
else if (x(istep) .LT. 0.5) then
i = i+1    ! go right
if (debug_random) then
print *, "go right"
print *, "i=", i
print *, "istep=", istep
print *, "x=", x(istep)
endif
n_walks = n_walks + 1
else if (x(istep) .LT. 0.75) then
j = j-1    ! go down
if (debug_random) then
print *, "go up"
print *, "j=", j
print *, "istep=", istep
print *, "x=", x(istep)
endif
n_walks = n_walks + 1
else
j = j+1    ! go up
if (debug_random) then
print *, "go down"
print *, "j=", j
print *, "istep=", istep
print *, "x=", x(istep)
endif
n_walks = n_walks + 1
endif
! ---
! ALL PROCESSES: What if we are at the boundary?
! ---
if (i*j*(nx+1 - i)*(ny+1 - j) .EQ. 0) then
xb = ax + i*dx 			! x is ax or ax+i*dx
yb = ay + j*dy			! y is ay or ay+j*dy
ub = uboundary(xb, yb)	! the boundary value is then computed
iabort = 0				! success
n_success_proc = n_success_proc + 1
if (debug_random) then
print *, "Reached the boundary"
endif
go to 99				! end do loop
endif
! ---
! ALL PROCESSES: What if we never reach the boundary within max steps?
! ---
if (istep .EQ. max_steps) then
ub = 0					! set ub to zero
iabort = 1				! failure
n_success_proc = n_success_proc
if (debug_random) then
print *, "Failed to reach boundary"
endif
endif
enddo

99  continue

end subroutine random_walk
! ---
! ALL PROCESSES: A single random walk
! Inputs: i0 (step in x), j0 (step in y), max_steps (maximum number of steps)
! Outputs: ub (value at boundary), iabort (0 for success, 1 for failure)
! Example use: many_walks(0, 0, 100, ub, iabort)
! ---
subroutine many_walks(i0, j0, max_steps, n_mc, u_mc, n_success, debug_many)
! ---
! ALL PROCESSES: Declare parameters
! ---
implicit none
integer, intent(in) :: i0, j0, max_steps, n_mc
real(kind=8), intent(out) :: u_mc
integer, intent(out) :: n_success
logical, intent(in) :: debug_many
real(kind=8), allocatable, dimension(:) :: int_array
integer :: i, j, k, iabort, num_processes, process_num, &
ierr, m, r, sender, iabort_r, answer, num_sent, next_walk, jj
integer, dimension(MPI_STATUS_SIZE) :: status
logical :: debug_random
call MPI_COMM_SIZE(MPI_COMM_WORLD, num_processes, ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, process_num, ierr)

ub_sum = 0
n_success = 0
num_sent = 0
n_success_proc = 0
! ---
! MASTER PROCESS
! ---
if(process_num .EQ. 0) then
allocate(int_array(n_mc))	 ! to hold answer from Workers
! ---
! INITIALIZE THE PROCESS
! Send: 0, Tag: 1, To: Processes 1 to 3
! Increment number sent by 1
! ---
do m=1,num_processes-1
call MPI_SEND(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, &
m, 1, MPI_COMM_WORLD, ierr)
num_sent = num_sent + 1
if (debug_many) then
print '("Process ",i1," sent tag = ",i1, " to process ",i1)',process_num,1,m
endif
enddo
if (debug_many) then
print '("Number of values sent initially ",i1)',num_sent
endif
! ---
! RECIEVE ALL POSSIBLE CALLS TO RANDOM WALK
! Receive: answer_r, Tag: iabort (0 or 1), From: Any Process (1 to 3)
! Sender is now free
! ---
do m=1,n_mc
MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierr)
iabort_r = status(MPI_TAG)
sender = status(MPI_SOURCE)    ! Whichever process sent it is now free
! ---
! HAVE WE REACHED THE BOUNDARY?
! If the walk reached the boundary, iabort = 0, so add it to the array
! ---
if(iabort_r .EQ. 0) then
n_success = n_success + 1
else
int_array(m) = 0
n_success = n_success
endif
if (debug_many) then
print '("Process ",i1," recieved answer = ",es22.14," tag = ",i3, " from process ",i1)', &
process_num,int_array(m),iabort_r,sender
endif
! ---
! SEND ADDITIONAL CALLS TO RANDOM WALK
! If the number sent is less than the number of Monte Carlo points
! Send: 0, Tag: 1, To: Processes 1 to 3 (whichever is free)
! Increment number sent by 1
! ---
if (num_sent .LT. n_mc) then
call MPI_SEND(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, &
sender, 1, MPI_COMM_WORLD, ierr)
num_sent = num_sent + 1
if (debug_many) then
print '("Process ",i1," sent tag = ",i1, " to process ",i1)', &
process_num,1,sender
endif
! ---
! END WORKER PROCESSES
! Send: 0, Tag: 0, To: Processes 1 to 3 (whichever is free)
! ---
else
call MPI_SEND(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, &
sender, 0, MPI_COMM_WORLD, ierr)
endif
enddo
! ---
! SUM OVER THE ARRAY TO RETURN U_MC TO MANY_WALKS
! ---
ub_sum=sum(int_array) 		! Summation of the values from each process
u_mc = ub_sum/n_success		! Returned to many_walks
if (debug_many) then
print '("Number of values sent finally ",i10)', num_sent
endif
endif

! ---
! WORKER PROCESS
! ---
if(process_num .GT. 0) then

do while (.true.)
! ---
! RECIEVE CALLS FROM MASTER PROCESS
! Receive: 0, Tag: 1, From: Master (0)
! Sender is now free
! ---
call MPI_RECV(MPI_BOTTOM, 0, MPI_DOUBLE_PRECISION, &
MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, status, ierr)
r = status(MPI_TAG)
if (debug_many) then
print '("Process ",i1," recieved tag = ",i1, " from process ",i1)',process_num,r,0
endif
! ---
! THERE IS ANOTHER WALK TO DO
! Call Random Walk again
! ---
if(r .EQ. 1) then
i = i0
j = j0
debug_random = .False.	! Change n_mc to 3 if debugging this
call random_walk(i, j, max_steps, ub, iabort, debug_random)
! ---
! SEND THE ANSWER TO THE MASTER
! Send: ub, Tag: iabort(0 or 1), To: Master (Process 0)
! ---
call MPI_SEND(ub, 1, MPI_DOUBLE_PRECISION, &
0, iabort, MPI_COMM_WORLD, ierr)
if (debug_many) then
print '("Process ",i1," sent answer = ",es22.14, " & tag = ",i3, " to process ",i1)', &
process_num,ub,iabort,0
endif
! ---
! THERE ISN'T ANOTHER WALK TO DO
! Worker finishes
! ---
else if(r .EQ. 0) then
if (debug_many) then
print '("Process ",i1," ended ")', process_num
endif
go to 100
endif
enddo
endif
! ----
! COMPLETE MPI
! ----
100 	continue ! Jumps here if the Worker finishes

end subroutine many_walks

end module mc_walk

! ---
! This module returns seed for the random_number function
! ---
module random_util

contains

subroutine init_random_seed(seed1)
! ---
! Declare the parameters
! ---
integer, intent(inout) :: seed1
integer :: clock
integer, dimension(:), allocatable :: seed
! ---
! Determine how many numbers needed to seed and allocate
! ---
call random_seed(size = nseed)
allocate(seed(nseed))
! ---
! If seed1 = 0 then set seed1 randomly using the system clock.
! This will give different sequences of random numbers
! from different runs, so results are not reproducible.
! ---
if (seed1 == 0) then
call system_clock(count = clock)
seed1 = clock
endif
! ---
! If seed1 > 0 then results will be reproducible since calling this
! twice with the same seed will initialize the random
! number generator so the same sequence is generated.
! Once seed1 is set, set the other elements of the seed array by adding
! multiples of 37 as suggested in the documentation.
! ---
do i=1,nseed
seed(i) = seed1 + 37*(i-1)
enddo
! ---
! Seed the generator
! ---
call random_seed(put = seed)
! ---
! Deallocate the seed
! ---
deallocate(seed)

end subroutine init_random_seed

end module random_util

program test1
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, n_walks, x
use random_util, only: init_random_seed
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy
real(kind=8) :: ub, rel_error, x0, y0
integer :: iabort, i, j, max_steps, seed1, n_success, i0, j0

! ---
! Number of steps is unknown, so we must allocate:
! ---
max_steps = 100*max(nx, ny)
allocate(x(max_steps)) ! random number array 0 to 1

! Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6

i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)

! shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy

u_true = utrue(x0,y0)

! ---
! Generate random number array x (0 to 1)
! ---
seed1 = 12345   	! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)

i = i0
j = j0

n_success = 0
print *, ""
print '(" True solution of PDE at x =",f10.4," y =",f10.4)', x0, y0
print '(" i =",i8," j =",i8)', i, j
print '(" u_true =   ",es13.3)', u_true
call random_walk(i, j, max_steps, ub, iabort)
print *, "ub=", ub
print *, "x=",x(1)
print *, "iabort=", iabort
print *, "nwalks=", n_walks

end program test1

program test1b
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, n_walks, x
use random_util, only: init_random_seed
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy
real(kind=8) :: ub, rel_error, x0, y0
integer :: iabort, i, j, max_steps, seed1, n_success, n_mc
!	integer, parameter :: nx = 19
!	integer, parameter :: ny = 11
! ---
! Set the number of samples for the Monte Carlo Solution
! ---
n_mc=10
! ---
! Number of steps is unknown, so we must allocate:
! ---
max_steps = 20*max(nx, ny)*n_mc
!	max_steps = 10
allocate(x(max_steps)) ! random number array 0 to 1
! ---
! Generate random number array x (0 to 1)
! ---
seed1 = 12345   	! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)
! Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6

i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)

! shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy

u_true = utrue(x0,y0)

i = i0
j = j0
! ---
! Record the number of times we successfully hit the boundary
! ---
n_success = 0
! ---
! Record the number of random walks we make
! ---
n_walks = 0
! ---
! Loop through and call a single random walk each time, indexing
! You must start the walk using the last successful walk point
! ---
do k=1,n_mc 0
call random_walk(i, j, max_steps, ub, iabort)
if(iabort .NE. 1) then
ub_sum = ub_sum + ub
n_success = n_success + 1
endif
u_mc = ub_sum/n_success
print *, ""
print *, "ub=", ub
!	print *, "x=",x(k)
print *, "iabort=", iabort
print *, "nwalks=", n_walks
print *, "n_success=", n_success
print *, "u_mc=", u_mc
enddo
end program test1b

program test2
! ---
! Uses random seed module
! ---
use mc_walk, only: random_walk, many_walks, n_walks, x
use random_util, only: init_random_seed
use problem_description, only: utrue, ax, ay, bx, by, nx, ny, dx, dy

real(kind=8) :: u_mc, rel_error, x0, y0
integer :: n_mc, n_success, i, j, max_steps, seed1, k, i0, j0
! ---
! Number of steps is unknown, so we must allocate
! 200000 is the maximum number of moves
! ---
n_mc = 10
max_steps = 200000*max(nx, ny)
!max_steps = 25
allocate(x(max_steps)) ! random number array 0 to 1
! Try it out from a specific (x0,y0):
x0 = 0.9
y0 = 0.6

i0 = nint((x0-ax)/dx)
j0 = nint((y0-ay)/dy)

! shift (x0,y0) to a grid point if it wasn't already:
x0 = ax + i0*dx
y0 = ay + j0*dy

u_true = utrue(x0,y0)

! ---
! Generate random number array x (0 to 1)
! ---
seed1 = 12345   	! seed1 = 12345 for repeatable seed
! seed1 = 0 for random seed
call init_random_seed(seed1)
call random_number(x)
! ---
! Set the point of interest
! ---
i = i0
j = j0
! ---
! Record the number of random walks we make
! ---
n_walks = 0
print *, ""
print '(" True solution of PDE at x =",f10.4," y =",f10.4)', x0, y0
print '(" u_true =   ",es13.3)', u_true
print *, ""
print *, "    Successes  Walks    Monte Carlo Solution    Relative Error"
do k=1,13
call many_walks(i, j, max_steps, n_mc, u_mc, n_success)
n_mc = 2 * n_mc
rel_error = abs(u_true-u_mc)/u_true
print '("  ",i8," ",i8,"      ",es13.3,"        ",es13.3)', &
n_success, n_walks, u_mc, rel_error
enddo

end program test2

from pylab import *

# read in three columns from file and unpack into 3 arrays:
`