2022-11-21

**EE 512 - Fall 2022 Class Project**

Solve two of the following simulation problems in any computational language of your choice. Submit a report (a Jupyter notebook or an R Markdown document is strongly recommended) with your results, including all relevant captioned figures.

1. [Stochastic Control of Investment Portfolios]: An investor has two kinds of assets available to invest in: a risk-free asset Xt0 (safer) and a riskier asset Xt1, both modeled as Itô processes.

At time t, the investor is free to modify how much of their total wealth Zt is allocated to each asset. We use the time-varying policy (or decision) function ut ∈ [0, 1] to encode what fraction of Zt is in Xt0 vs. in Xt1. The investor must keep all of their wealth invested and cannot exceed it (i.e. no borrowing or short selling).

This leads to the following expression for the investor’s wealth process (taking ut to be the fraction of wealth held in the riskier asset):

dZt = Zt [(1 − ut) b + ut a] dt + ut α Zt dBt .

Assume the investor acts on a monthly clock t for up to t = 500 months.

(a) Identify conditions on ut that would make Zt an Itô process.

(b) Pick a few (e.g. 10) different risk profiles (a, b, α) for Zt. In real applications, typically b < a. Assume the investor defaults to investing equally in both assets (i.e. ut = 0.5 ∀t). Plot ensembles of sample paths for Zt under your selected risk profiles (a, b, α).
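A minimal sketch of such an ensemble simulation, assuming wealth dynamics of the form dZt = Zt[(1 − ut)b + ut a]dt + ut α Zt dBt with ut the fraction held in the riskier asset (an assumed form; check it against the wealth SDE given in class), stepped monthly with Euler–Maruyama:

```python
import numpy as np

# Assumption: risky drift a, risk-free rate b, risky volatility alpha,
# and u = fraction of wealth in the risky asset (constant here).
def simulate_wealth(a, b, alpha, u=0.5, z0=1.0, n_months=500, n_paths=1000, seed=0):
    rng = np.random.default_rng(seed)
    dt = 1.0  # monthly clock
    Z = np.full(n_paths, z0)
    paths = np.empty((n_months + 1, n_paths))
    paths[0] = Z
    for t in range(1, n_months + 1):
        dB = rng.normal(0.0, np.sqrt(dt), n_paths)
        drift = ((1 - u) * b + u * a) * Z * dt
        diffusion = u * alpha * Z * dB
        Z = np.maximum(Z + drift + diffusion, 1e-12)  # keep wealth positive
        paths[t] = Z
    return paths

# Illustrative monthly parameters (not prescribed by the assignment)
paths = simulate_wealth(a=0.01, b=0.004, alpha=0.05)
```

The returned array has one row per month, so an ensemble plot is just `plt.plot(paths[:, :50])` over a subsample of columns.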

(c) To evaluate the quality of a decision-making process, it is customary to estimate the average utility derived from following that decision process. Utility functions u(·) are typically concave increasing functions. For this problem, we use the logarithmic utility function u(z) = ln(z) (referred to as the Kelly Criterion). Use your ensembles of sample paths to estimate the average utility in the terminal month t = 500 derived from following the default investment policy above, ut = 0.5 ∀t; i.e. derive Monte Carlo estimates for

E[u(Z500)] = E[ln(Z500)].

(d) Derive Monte Carlo estimates for the average terminal utility derived from following this alternative investment policy (also a constant fraction over time):

ut = u∗ = (a − b)/α² ∀t.

Compare the performance of both investment policies.
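A sketch of the comparison under the same assumed wealth dynamics dZ = Z[(1 − u)b + u a]dt + u α Z dB; the fraction u∗ = (a − b)/α² is the log-optimal constant policy for those dynamics (clipped into [0, 1] to respect the no-borrowing constraint):

```python
import numpy as np

def terminal_log_utility(u, a, b, alpha, z0=1.0, n_months=500, n_paths=2000, seed=1):
    rng = np.random.default_rng(seed)
    Z = np.full(n_paths, z0)
    for _ in range(n_months):
        dB = rng.normal(0.0, 1.0, n_paths)  # dt = 1 month
        Z = Z * (1 + (1 - u) * b + u * a + u * alpha * dB)
        Z = np.maximum(Z, 1e-12)            # guard against ruin in the log
    return np.log(Z).mean()                 # Monte Carlo estimate of E[ln Z_500]

# Illustrative risk profile (a, b, alpha)
a, b, alpha = 0.01, 0.004, 0.05
u_star = min(max((a - b) / alpha**2, 0.0), 1.0)  # clip into [0, 1]

u_default = terminal_log_utility(0.5, a, b, alpha)
u_kelly = terminal_log_utility(u_star, a, b, alpha)
```

Comparing `u_default` and `u_kelly` across your chosen risk profiles gives the performance comparison asked for here.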

(e) Do one of the following:

i. Compare the two investment policies under your chosen generalization of the control problem, e.g. modeling Xt1 with a jump process, using a different (valid) utility function, adding more investment options, etc. OR

ii. Give a detailed outline of how you would apply statistical methods to learn optimal investment policies from data and experimentation (e.g. reinforcement learning). Be sure to characterize the strengths and weaknesses of your proposal.

2. [Options Pricing via Monte Carlo Estimation]: Generate T = 500-day sample paths of the geometric Brownian motion (GBM) for different values of (μ, σ) using the Euler-Maruyama simulation scheme. Recall that the GBM is the stochastic process that satisfies the following SDE:

dSt = μ(t, St) dt + σ(t, St) dBt = μ · St dt + σ · St dBt .
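A minimal Euler–Maruyama discretisation of this SDE on a daily grid (the parameter values below are illustrative, not prescribed by the assignment):

```python
import numpy as np

# Euler-Maruyama scheme for dS = mu*S dt + sigma*S dB:
# S_{t+1} = S_t + mu*S_t*dt + sigma*S_t*dB, with dB ~ N(0, dt).
def gbm_paths(mu, sigma, s0=100.0, T=500, dt=1.0, n_paths=1000, seed=0):
    rng = np.random.default_rng(seed)
    n_steps = int(T / dt)
    S = np.empty((n_steps + 1, n_paths))
    S[0] = s0
    for t in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), n_paths)
        S[t + 1] = S[t] + mu * S[t] * dt + sigma * S[t] * dB
    return S

S = gbm_paths(mu=0.0005, sigma=0.01)  # e.g. daily drift and volatility
```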

(a) Assume a GBM model for an underlying asset’s spot price St. Suppose we have a European call option on the asset St with a strike price of K. Write and test an evaluation function that takes a GBM sample path and K as input; the output is the payoff for the call option. Recall that a European call option may only be exercised at the expiration time T = 500 and has the value

Z = g(ST) = max{ST − K, 0}.
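A minimal payoff evaluator for part (a); `path` is any sequence of spot prices with the terminal price last:

```python
# European call payoff: exercisable only at expiry, so only the final
# price on the path matters.
def euro_call_payoff(path, K):
    return max(path[-1] - K, 0.0)

euro_call_payoff([90.0, 105.0, 110.0], 100.0)  # → 10.0
```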

(b) Specify 5 sets of GBM parameters (μ, σ) for 5 different assets {Stk}k. Use Monte Carlo to estimate the expected value of European call options on each of these underlying assets, and provide confidence intervals for your estimates. You may make any simplifying assumptions (e.g. constant asset volatilities) as long as you state them.
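One way to attach a confidence interval to these estimates is the normal approximation; here `payoffs` stands for the array of simulated terminal payoffs, one per sample path (a name assumed for illustration):

```python
import numpy as np

# Normal-approximation confidence interval for a Monte Carlo mean:
# mean ± z * sample_std / sqrt(n), with z = 1.96 for ~95% coverage.
def mc_estimate(payoffs, z=1.96):
    payoffs = np.asarray(payoffs, dtype=float)
    mean = payoffs.mean()
    half = z * payoffs.std(ddof=1) / np.sqrt(len(payoffs))
    return mean, (mean - half, mean + half)
```

Usage: `value, (lo, hi) = mc_estimate(payoffs)` for each of the 5 assets.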

(c) Suppose the risk-free rate of return is r = 0.05. Modify your value estimation process to estimate the discounted expected payoff for the option.
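Discounting is then a single multiplication by exp(−r·τ), where τ is the time to expiry in years; treating the 500-day horizon as 500/252 trading years is an assumption you should state (any consistent day-count convention works):

```python
import numpy as np

r = 0.05            # given risk-free rate
tau = 500 / 252     # time to expiry in years (252-trading-day convention assumed)
mean_payoff = 10.0  # placeholder: plug in your Monte Carlo estimate from part (b)

discounted_value = np.exp(-r * tau) * mean_payoff
```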

(d) Replace the European call option with an Asian call option monitored every 50 days. This is a path-dependent option, also exercised at the expiration time T, but valued as

ZAs = gAs(ST) = max{S̄ − K, 0} where S̄ = Avg({S50, S100, · · · , ST}).

Use Monte Carlo to estimate the value of Asian options on the same 5 sets of GBM-modeled assets you chose above. Compare the value of the European and Asian call options based on the results of your experiments.
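A minimal evaluator for the Asian payoff, assuming the sample path is indexed daily with index 0 holding the initial price S0 (so the monitoring dates are indices 50, 100, …, 500):

```python
import numpy as np

# Asian call payoff: average the path at the 50-day monitoring dates
# S_50, S_100, ..., S_500, then apply the call payoff to that average.
def asian_call_payoff(path, K, every=50):
    monitored = np.asarray(path)[every::every]  # S_50, S_100, ..., S_T
    return max(monitored.mean() - K, 0.0)
```

Feeding the same GBM sample paths through both payoff functions gives the European-vs-Asian comparison asked for here.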

(e) Repeat the European option value estimation process using jump-diffusion stochastic processes for the underlying asset prices instead of just a diffusion GBM process.
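One sketch for this part uses a Merton-style jump-diffusion in which at most one log-normal jump occurs per day (a good approximation when the jump intensity per step is small); the jump intensity `lam` and size parameters `jump_mu`, `jump_sigma` are illustrative choices, not specified by the assignment:

```python
import numpy as np

def jump_diffusion_paths(mu, sigma, lam=0.05, jump_mu=-0.02, jump_sigma=0.05,
                         s0=100.0, T=500, dt=1.0, n_paths=1000, seed=0):
    rng = np.random.default_rng(seed)
    n_steps = int(T / dt)
    S = np.empty((n_steps + 1, n_paths))
    S[0] = s0
    for t in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), n_paths)
        diffusion = S[t] * (1 + mu * dt + sigma * dB)  # GBM Euler step
        # At most one jump per step, occurring with probability lam*dt
        jump = rng.random(n_paths) < lam * dt
        mult = np.where(jump, np.exp(rng.normal(jump_mu, jump_sigma, n_paths)), 1.0)
        S[t + 1] = np.maximum(diffusion * mult, 1e-12)
    return S

S_jump = jump_diffusion_paths(mu=0.0005, sigma=0.01)
```

The resulting paths plug into the same payoff and Monte Carlo machinery as the pure GBM case.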

3. [Optimal City Paths]: The famous Traveling Salesman Problem (TSP) is an NP-hard routing problem. The time to find optimal solutions to TSPs grows exponentially with the size of the problem (number of cities). A statement of the TSP goes thus:

A salesman needs to visit each of N cities exactly once and in any order. Each city is connected to other cities via an air transportation network. Find a minimum length path on the network that goes through all N cities exactly once (an optimal Hamiltonian cycle).

A TSP solution is just an ordered list of the N cities with minimum path length. We will be exploring MCMC solutions to small and larger scale versions of the problem.

(a) Pick N = 10 2-D points in the [0, 1000] × [0, 1000] rectangle. These 2-D samples will represent the locations of N = 10 cities.

Write a function to capture the objective function of the TSP problem: the total tour length

L(⃗c) = ∑_{i=1}^{N} ∥c_i − c_{i+1}∥, with c_{N+1} := c_1.

Then run simulated annealing:

Start with a random path through all N cities ⃗c (a random permutation of the cities), an initial high temperature T0, and a cooling schedule Tk = f(T0, k).

Randomly pick any two cities in your current path and swap them. Use the difference between the new and old path lengths to calculate a Gibbs acceptance probability, and update the path accordingly.

Update your annealing temperature and repeat the previous city-swap step. Run the simulated annealing procedure “to convergence.”

Plot the values of your objective function from each step. Plot your final TSP city tour.
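The steps above can be sketched as follows; the geometric cooling schedule Tk = T0 · 0.999^k and the Metropolis/Gibbs acceptance rule exp(−ΔL/Tk) are one common choice, and both should be tuned per problem size:

```python
import numpy as np

def path_length(cities, order):
    # Total closed-tour length: consecutive legs plus the return leg
    # from the last city back to the first.
    pts = cities[order]
    return np.sum(np.linalg.norm(pts - np.roll(pts, -1, axis=0), axis=1))

def anneal_tsp(cities, T0=1000.0, cooling=0.999, n_iters=20_000, seed=0):
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(cities))   # random initial tour
    cur = path_length(cities, order)
    history = [cur]
    T = T0
    for _ in range(n_iters):
        i, j = rng.integers(0, len(cities), size=2)
        order[i], order[j] = order[j], order[i]      # propose a swap
        new = path_length(cities, order)
        if new <= cur or rng.random() < np.exp(-(new - cur) / T):
            cur = new                                # accept the move
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo the swap
        T *= cooling                                 # cool the temperature
        history.append(cur)
    return order, cur, history

cities = np.random.default_rng(42).uniform(0, 1000, size=(10, 2))
order, length, history = anneal_tsp(cities)
```

Plotting `history` gives the objective-value trace, and `cities[order]` gives the final tour for plotting.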

(b) Run the Simulated Annealing TSP solver you just developed for N = {40, 400, 1000} cities. Explore the speed and convergence properties at these different problem sizes. You might want to play with the cooling schedules.