Aither CFD
An open source structured multi-block CFD code.
http://mnucci32.github.io/aither/
Sun, 27 Aug 2017 18:18:02 +0000
Jekyll v3.5.2
Nonreflecting Outlet Boundary Condition
<h2 id="subsonic-outflow">Subsonic Outflow</h2>
<p>At a subsonic outflow there are four characteristics leaving the domain and one characteristic entering the domain. This means that the flow state on the boundary can be calculated from the interior cell’s state for four of the five primitive variables. Typically density and the three components of velocity are calculated from the interior cell’s state. This leaves pressure to be specified using other information supplied by the user. This user-supplied data is used with the incoming characteristic to determine the pressure on the boundary. The incoming 1D characteristic equation, with characteristic variable <script type="math/tex">w</script>, is shown below.</p>
<script type="math/tex; mode=display">\frac{\partial w}{\partial t} = \frac{\partial u}{\partial t} - \frac{1}{\rho c} \frac{\partial p}{\partial t}</script>
<h2 id="standard-pressure-outlet">Standard Pressure Outlet</h2>
<p>With a standard pressure outlet, the implementation in most codes is to set the boundary pressure to a user-defined value (shown below). This means that in the above characteristic equation <script type="math/tex">\frac{\partial p}{\partial t} = 0</script>, which shows that there is a reflection of intensity <script type="math/tex">\frac{\partial u}{\partial t}</script>. This reflection can delay convergence as well as corrupt simulations where acoustic pressure fluctuations are important, such as aeroacoustic simulations and large eddy simulations.</p>
<script type="math/tex; mode=display">p_b = p_{ref}</script>
<h2 id="nonreflecting-outlet">Nonreflecting Outlet</h2>
<p>For a truly nonreflecting outlet <script type="math/tex">\frac{\partial w}{\partial t} = 0</script>. However, using this boundary condition makes it difficult to set the pressure on the boundary: when it is used, the boundary pressure tends to float [1]. For this reason, a relaxation coefficient <script type="math/tex">\kappa</script> is typically used. Rudy and Strikwerda [2] suggested <script type="math/tex">\kappa</script> in the form shown below, where <script type="math/tex">M_{max}</script> is the maximum Mach number on the boundary. The boundary condition proposed by Rudy and Strikwerda uses the locally one-dimensional inviscid (LODI) assumption. Many researchers have extended this approach to account for multi-dimensional flow at the outlet boundary by including the effect of transverse terms, <script type="math/tex">T</script> [3]. The nonreflecting boundary condition is shown below, where superscripts represent the time level of the variables.</p>
<script type="math/tex; mode=display">p_{b}^{n+1} = \frac{ p^n + \rho^n c^n \left( \overrightarrow{v}^{n+1} - \overrightarrow{v}^{n} \right)
\cdot \overrightarrow{n} + \Delta t \kappa p_{ref} - \Delta t \beta T} {1 + \Delta t \kappa}</script>
<script type="math/tex; mode=display">\kappa = \frac{\sigma c^n \left( 1 - M_{max}^{2} \right)}{l}</script>
<p>The transverse terms are shown below. Here the subscript <script type="math/tex">t</script> represents the transverse direction, and the subscript <script type="math/tex">n</script> represents the boundary normal direction. For the <script type="math/tex">\beta</script> calculation, the Mach number used is the average Mach number on the boundary.</p>
<script type="math/tex; mode=display">\beta = M_{avg}</script>
<script type="math/tex; mode=display">T = -0.5 \left[ \overrightarrow{v}_{t}^{n} \cdot \left( \overrightarrow{\nabla}_{t} p^n - \rho^n c^n \overrightarrow{\nabla}_{t} \overrightarrow{v}_{n}^{n} \right) + \gamma p^n \overrightarrow{\nabla}_{t} \cdot \overrightarrow{v}_{t}^{n} \right]</script>
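<p>As a rough illustration of the pressure update and relaxation coefficient above, the two formulas can be sketched in Python. The function and argument names are hypothetical and are not taken from the Aither source. The sketch assumes that <script type="math/tex">\sigma</script> is a user-chosen relaxation constant, that <script type="math/tex">l</script> is a characteristic length of the domain, and that the transverse term and its weight are supplied by the caller.</p>

```python
def relaxation_coefficient(sigma, c, mach_max, length):
    """kappa = sigma * c * (1 - M_max^2) / l."""
    return sigma * c * (1.0 - mach_max**2) / length


def nonreflecting_pressure(p_old, rho, c, dvn, dt, kappa, p_ref,
                           beta=0.0, trans=0.0):
    """Boundary pressure at time level n+1.

    dvn is (v^{n+1} - v^n) . n, the change in boundary-normal velocity.
    beta and trans are the transverse weight and transverse term T.
    """
    numerator = p_old + rho * c * dvn + dt * kappa * p_ref - dt * beta * trans
    return numerator / (1.0 + dt * kappa)


# Hypothetical values, chosen only to exercise the formulas.
kappa = relaxation_coefficient(sigma=0.25, c=340.0, mach_max=0.3, length=1.0)
p_new = nonreflecting_pressure(p_old=101300.0, rho=1.2, c=340.0, dvn=0.0,
                               dt=1.0e-5, kappa=kappa, p_ref=101300.0)
print(kappa, p_new)
```

<p>With <script type="math/tex">\kappa = 0</script> the update reduces to the purely nonreflecting form, and as <script type="math/tex">\Delta t \kappa</script> grows the boundary pressure is pulled toward <script type="math/tex">p_{ref}</script>, which recovers the behavior of the standard pressure outlet.</p>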
<h2 id="convecting-lamb-oseen-vortex">Convecting Lamb-Oseen Vortex</h2>
<p>A common test case for nonreflecting boundary conditions is a vortex convecting through an outlet boundary [1]. Ideally the vortex leaves the domain and there are no pressure waves reflected back into the domain. With a standard pressure outlet implementation, this will not be the case. Nonreflecting boundary conditions can significantly reduce the reflections at the boundary.</p>
<p>A test case was added to Aither for a convecting vortex corresponding to Case C in [1]. This simulation involves <script type="math/tex">N_2</script> with a nominal pressure of 101300 Pascals and temperature of 288 Kelvin. The freestream velocity is 100 meters per second, and a Lamb-Oseen vortex with a strength of 0.11 <script type="math/tex">\frac{m^2}{s}</script>, centered at the middle of the domain, is superimposed on the freestream flow. The radius of the vortex is one tenth the length of the domain. Results for the standard outlet and nonreflecting outlet are shown below in terms of the nondimensional pressure <script type="math/tex">p^{*}</script>. Note that the <script type="math/tex">p^{*}</script> used in the plot has the opposite sign of that used in [1], because it is more intuitive for the vortex core to be shown as having low pressure.</p>
<script type="math/tex; mode=display">p^{*} = \left(p - p_{ref} \right) \frac{2 R^2}{\rho \Gamma^2}</script>
<p><img src="/downloads/convectingVortex.gif" alt="ConvectingVortex" class="center-image" /></p>
<center>Comparison of standard pressure outlet with nonreflecting pressure outlet.</center>
<h2 id="references">References</h2>
<p>[1] Granet, Victor, et al. “Comparison of Nonreflecting Outlet Boundary Conditions for Compressible Solvers on Unstructured Grids”. 2010.</p>
<p>[2] Rudy, David and Strikwerda, John. “A Nonreflecting Outflow Boundary Condition for Subsonic Navier-Stokes Calculations”. Journal of Computational Physics. Vol 36, pp 55-70. 1980.</p>
<p><a href="http://www-personal.umich.edu/~hgim/PDF/CTM06-BC.pdf">[3]</a> Yoo, C. S. and Im, H. G. “Characteristic Boundary Conditions for Simulations of Compressible Reacting Flows with Multi-Dimensional, Viscous, and Reaction Effects”. June 29, 2006.</p>
Sun, 27 Aug 2017 11:00:00 +0000
http://mnucci32.github.io/aither/2017/08/27/nonreflecting-bcs.html
CFD, Aither, C++, nonreflecting, NRBC, characteristic, outlet
Version 0.7.0 Released
<h2 id="release-notes">Release Notes</h2>
<p>The 0.7.0 release of Aither is available on
<a href="https://github.com/mnucci32/aither/releases">Github</a>. This release adds
a thermally perfect gas model, the ability to assign initial conditions from a
cloud of points, and the AUSMPW+ inviscid flux scheme. These options can be
invoked in the input file as shown below.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>thermodynamicModel: thermallyPerfect
initialConditions: <icState(tag=-1; file=cloudOfPoints.dat)>
inviscidFlux: ausm
</code></pre>
</div>
<h2 id="features-added">Features Added</h2>
<ul>
<li>Thermally perfect gas model</li>
<li>Ability to assign initial conditions from file</li>
<li>AUSMPW+ inviscid flux</li>
<li>Test cases
<ul>
<li>Thermally perfect supersonic flow over ramp</li>
<li>Uniform flow testing interblock orientations</li>
</ul>
</li>
</ul>
Tue, 13 Jun 2017 22:00:00 +0000
http://mnucci32.github.io/aither/2017/06/13/version-0-7-0.html
CFD, Aither, C++, thermally perfect gas, initial conditions, AUSM, AUSMPW+
Thermally Perfect Thermodynamic Model
<h2 id="calorically-perfect-gas">Calorically Perfect Gas</h2>
<p>The default thermodynamic model in Aither is the calorically perfect gas model.
Calorically perfect gases have constant specific heats
<script type="math/tex">\left( c_p, c_v \right)</script>, and therefore a constant <script type="math/tex">\gamma</script>. In general
the calorically perfect gas model is a good assumption for air at lower
temperatures. The molecules of a calorically perfect gas are assumed to be rigid
so there is no vibrational component of internal energy. The equations below
show the implementation of the calorically perfect gas thermodynamic model in
Aither.</p>
<script type="math/tex; mode=display">e = e_{translational} + e_{rotational} + e_{vibrational} = c_v T</script>
<script type="math/tex; mode=display">e_{translational} = \left( n - 1 \right) R T
\,\,\,\,\,\,\,
e_{rotational} = R T
\,\,\,\,\,\,\,
e_{vibrational} = 0</script>
<script type="math/tex; mode=display">h = c_p T</script>
<script type="math/tex; mode=display">c_v = n R
\,\,\,\,\,\,\,
c_p = \left( n + 1 \right) R
\,\,\,\,\,\,\,
\gamma = \frac{c_p}{c_v} = \frac{1}{n} + 1</script>
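<p>The relations above are simple enough to check by hand, but a small Python sketch makes the numbers concrete. The function name is hypothetical, and the values <script type="math/tex">n = 2.5</script> and a molar mass of 0.02897 kg/mol are assumed here as representative of air.</p>

```python
R_UNIVERSAL = 8.314462618  # universal gas constant, J/(mol K)


def calorically_perfect(n, molar_mass):
    """Return (R, cv, cp, gamma) for a calorically perfect gas."""
    gas_constant = R_UNIVERSAL / molar_mass  # specific gas constant, J/(kg K)
    cv = n * gas_constant
    cp = (n + 1.0) * gas_constant
    return gas_constant, cv, cp, cp / cv


# Assumed values for air.
R_air, cv_air, cp_air, gamma_air = calorically_perfect(n=2.5, molar_mass=0.02897)
print(R_air, cv_air, cp_air, gamma_air)
```

<p>For these inputs the ratio of specific heats comes out as <script type="math/tex">\gamma = 1/2.5 + 1 = 1.4</script>, independent of temperature.</p>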
<h2 id="thermally-perfect-gas">Thermally Perfect Gas</h2>
<p>As the temperature of a gas increases the vibrational modes of its molecules
are activated [1,2]. This means that some of the energy of the gas will move
into these vibrational modes instead of raising the temperature of the gas.
Therefore at higher temperatures where the vibrational modes are activated
a thermally perfect gas will have a cooler temperature than a calorically
perfect gas. The temperature at which the vibrational modes become significant
is determined by the vibrational temperature <script type="math/tex">T_v</script> of the gas. As the
temperature of the gas approaches this value the vibrational component of
internal energy becomes more and more significant.</p>
<p>The activation of vibrational modes also means that the specific heats are no
longer constant. Instead they are assumed to be functions of temperature only.
The equations below show the implementation of the thermally perfect
thermodynamic model in Aither.</p>
<script type="math/tex; mode=display">e = e_{translational} + e_{rotational} + e_{vibrational} =
\int_0^T c_v \left( t \right) dt
\,\,\,\,\,\,\,
e_{vibrational} = \frac{R T_v}{e^{\frac{T_v}{T}} - 1}</script>
<script type="math/tex; mode=display">h = \int_0^T c_p \left( t \right) dt</script>
<script type="math/tex; mode=display">c_v = \frac{\partial e}{\partial T} =
R \left( n + \left[ \frac{\theta_v}{\sinh \left( \theta_v \right)} \right]
^ 2 \right)
\,\,\,\,\,\,\,
c_p = \frac{\partial h}{\partial T} =
R \left( n + 1 + \left[ \frac{\theta_v}{\sinh \left( \theta_v \right)} \right]
^ 2 \right)
\,\,\,\,\,\,\,
\theta_v = \frac{T_v}{2 T}</script>
<script type="math/tex; mode=display">\gamma = \frac{c_p \left( T \right)}{c_v \left( T \right)}</script>
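<p>A minimal Python sketch of the thermally perfect relations above is given here to show how the specific heats vary with temperature. It is not Aither's implementation; the function names are hypothetical, and the specific gas constant of 287 J/(kg K), <script type="math/tex">n = 2.5</script>, and a vibrational temperature of 3056 K are assumed as representative of air.</p>

```python
import math


def vibrational_energy(temperature, gas_constant, t_vib):
    """e_vibrational = R Tv / (exp(Tv / T) - 1)."""
    return gas_constant * t_vib / math.expm1(t_vib / temperature)


def thermally_perfect(temperature, n, gas_constant, t_vib):
    """Return (cv, cp, gamma) at the given temperature."""
    theta = t_vib / (2.0 * temperature)
    vib = (theta / math.sinh(theta)) ** 2  # vibrational contribution to cv / R
    cv = gas_constant * (n + vib)
    cp = gas_constant * (n + 1.0 + vib)
    return cv, cp, cp / cv


R_AIR = 287.0  # J/(kg K), assumed specific gas constant
for temp in (300.0, 1000.0, 2000.0):
    cv, cp, gamma = thermally_perfect(temp, n=2.5, gas_constant=R_AIR, t_vib=3056.0)
    print(temp, cv, cp, gamma)
```

<p>At 300 K the vibrational term is nearly zero and <script type="math/tex">\gamma</script> stays close to the calorically perfect value of 1.4, while at 2000 K it drops noticeably as the vibrational modes absorb energy.</p>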
<p>In Aither the thermally perfect thermodynamic model can be activated as shown
below.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>fluids: <fluid(name=air; n=2.5; molarMass=0.02897; vibrationalTemperature=3056)>
thermodynamicModel: thermallyPerfect
</code></pre>
</div>
<h2 id="example-problem">Example Problem</h2>
<p>An example problem that needs the thermally perfect model is now part
of the test cases that come with the Aither repository. It is currently only
available on the <strong>develop</strong> branch, but will be available on <strong>master</strong> after
the next release. The test case involves hot supersonic flow over a 20 degree
ramp. The freestream conditions of the flow are Mach 3, static temperature of
2000 K, and static pressure of 229,600 Pa.</p>
<p><img src="/downloads/cpg_tpg.png" alt="ThermallyPerfect" class="center-image" /></p>
<center>Comparison of calorically perfect and thermally perfect thermodynamic
models.</center>
<p>The results show that behind the shocks the thermally perfect gas model predicts
a cooler flow than the calorically perfect gas model. This is expected as with
the thermally perfect model some of the energy goes into the vibrational modes
of the gas molecules.</p>
<h2 id="references">References</h2>
<p><a href="http://www.donnerflug.de/thesis/Lampe_MS_Thesis.pdf">[1]</a>
Lampe, Dietrich Rudolf. “Thermally Perfect, Calorically Imperfect
Taylor-Maccoll Flow”. 1994.</p>
<p><a href="https://www.amazon.com/Hypersonic-High-Temperature-Dynamics-Second-
Education/dp/1563477807/ref=asap_bc?ie=UTF8">[2]</a>
Anderson, John. “Hypersonic and High Temperature Gas Dynamics”. 2nd Edition.
AIAA. 2006.</p>
Sat, 20 May 2017 17:00:00 +0000
http://mnucci32.github.io/aither/2017/05/20/thermally-perfect.html
CFD, Aither, C++, thermally perfect, calorically perfect, ideal gas, thermodynamics
Version 0.6.0 Released
<h2 id="release-notes">Release Notes</h2>
<p>The 0.6.0 release of Aither is available on
<a href="https://github.com/mnucci32/aither/releases">Github</a>. This release adds
periodic boundary conditions and wall functions for turbulent flow. Wall
variables can now be output to Plot3D meta files as well. Additional test cases
have been added for Couette flow and wall functions. The wall functions are
available for the k-<script type="math/tex">\omega</script> Wilcox, SST, SST-DES, and WALE turbulence models.
The wall functions are implemented in the way described by Nichols & Nelson [1].
If the first cell off the wall is in the log layer, the wall shear stress is
prescribed by the law of the wall.</p>
<script type="math/tex; mode=display">u^+ = \frac{1}{\kappa} \ln \left( y^+ \right) + B</script>
<p>If the first cell off the wall results in a <script type="math/tex">y^+</script> less than 10, the
boundary condition automatically switches back to the low Reynolds number
formulation.</p>
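<p>The switching logic described above can be sketched in a few lines of Python. This is only an illustration of the log law and the <script type="math/tex">y^+</script> threshold, not the Nichols and Nelson formulation that Aither implements. The constants <script type="math/tex">\kappa = 0.41</script> and <script type="math/tex">B = 5.0</script> are commonly quoted values assumed here, and the function names and the returned labels are hypothetical.</p>

```python
import math

KAPPA = 0.41          # von Karman constant (assumed value)
B = 5.0               # log-law intercept (assumed value)
Y_PLUS_SWITCH = 10.0  # below this the low Reynolds number form is used


def u_plus_log_law(y_plus):
    """Law of the wall: u+ = ln(y+) / kappa + B."""
    return math.log(y_plus) / KAPPA + B


def wall_treatment(y_plus):
    """Pick the treatment for the first cell off the wall from its y+."""
    return "lowReynolds" if y_plus < Y_PLUS_SWITCH else "wallLaw"


for yp in (1.0, 30.0, 100.0):
    print(yp, wall_treatment(yp), u_plus_log_law(yp))
```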
<p>Wall variable output can be requested by assigning a list of variables to the
<strong>wallOutputVariables</strong> input. Wall functions can be turned on by setting the
<strong>wallTreatment</strong> option for a <strong>viscousWall</strong> to <strong>wallLaw</strong> as shown below.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>wallOutputVariables: <yplus, heatFlux, shearStress, frictionVelocity>
boundaryStates: <viscousWall(tag=1; wallTreatment=wallLaw)>
</code></pre>
</div>
<h2 id="features-added">Features Added</h2>
<ul>
<li>Periodic boundary conditions</li>
<li>Wall functions for turbulent flow</li>
<li>Wall variables output</li>
<li>Test cases
<ul>
<li>Couette flow</li>
<li>Flat plate wall functions</li>
</ul>
</li>
</ul>
<h2 id="references">References</h2>
<p>[1] Nichols, R. H. & Nelson, C. C.
“Wall Function Boundary Conditions Including Heat Transfer and Compressibility”.
AIAA Journal, Vol 42, No 6, June 2004.</p>
Mon, 01 May 2017 20:00:00 +0000
http://mnucci32.github.io/aither/2017/05/01/version-0-6-0.html
CFD, Aither, C++, couette, periodic, isothermal, wall law, wall function
Couette Flow & New Boundary Conditions
<h2 id="couette-flow">Couette Flow</h2>
<p><a href="https://en.wikipedia.org/wiki/Couette_flow">Couette flow</a> is viscous laminar flow
between two parallel plates, one of which is moving relative to the other. Due to its
simple nature and the existence of an analytical solution, it is a common validation
case for CFD codes. An example validation case is shown in Hirsch [1]. Couette flow
has a constant shear stress, which gives a linear velocity profile and a parabolic
temperature profile as shown below.</p>
<script type="math/tex; mode=display">v(y) = \frac{y}{L} v_{wall}</script>
<script type="math/tex; mode=display">P_r E_c = \frac{\mu v^2_{wall}}{k \Delta T}</script>
<script type="math/tex; mode=display">T(y) = T_{low} + \Delta T \frac{y}{L} \left[1 + \frac{1}{2} P_r E_c \left(1 - \frac{y}{L} \right) \right]</script>
<p>Three of the newer features in Aither are periodic boundary conditions, moving walls,
and isothermal walls. A Couette flow simulation can make use of all of these features,
so it makes a great addition to the test case suite.</p>
<h2 id="problem-setup">Problem Setup</h2>
<p>The parallel plates are placed a distance of 0.001 meters apart. The bottom plate is
held at a temperature of 288 K and is stationary. The top plate is held at a temperature
of 289 K and is moving at 75.4 m/s. For this setup, the product of the Prandtl and Eckert
numbers is 4. This means that for the temperature profile, the maximum temperature will
not be at the plate, but in the flow instead. The exact solution dictates that the
maximum temperature should be three fourths of the way between the cold and hot plates.</p>
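<p>The location of the maximum temperature can be checked with a short Python sketch of the exact profiles. Setting the derivative of the temperature profile to zero gives <script type="math/tex">y/L = \frac{1}{2} + \frac{1}{P_r E_c}</script>, which is 0.75 when the product is 4. The function names are hypothetical, and the sketch assumes a positive <script type="math/tex">P_r E_c</script>.</p>

```python
def couette_velocity(eta, v_wall):
    """Linear velocity profile, eta = y / L."""
    return eta * v_wall


def couette_temperature(eta, t_low, delta_t, pr_ec):
    """Parabolic temperature profile, eta = y / L."""
    return t_low + delta_t * eta * (1.0 + 0.5 * pr_ec * (1.0 - eta))


def hottest_location(pr_ec):
    """Location of dT/d(eta) = 0, clipped to the gap (assumes pr_ec > 0)."""
    return min(1.0, 0.5 + 1.0 / pr_ec)


eta_max = hottest_location(4.0)
t_max = couette_temperature(eta_max, t_low=288.0, delta_t=1.0, pr_ec=4.0)
print(eta_max, t_max, couette_velocity(eta_max, v_wall=75.4))
```

<p>For the setup described here the peak temperature is slightly above the 289 K of the hot plate, which is why the maximum sits inside the flow rather than at the wall.</p>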
<p>The CFD domain is a rectangular prism with the top and bottom modeled as viscous walls,
the sides as slip walls, and the front / back as periodic. Isothermal walls can be
specified in Aither by adding the <strong>temperature</strong> parameter to the boundary state list.
Similarly moving walls can be specified by adding the <strong>velocity</strong> parameter. Periodic
boundary conditions are specified by indicating which boundary condition tags should be
paired as periodic. This is done through the <strong>startTag</strong> and <strong>endTag</strong> parameters. For
each periodic boundary condition a transformation must be done to get from one periodic
face to the other. Currently a translation can be specified by adding the <strong>translation</strong>
parameter which is a vector specifying how the boundary at the <strong>startTag</strong> should be
translated to get to the boundary at the <strong>endTag</strong>. Alternatively a rotation can be
specified by using the <strong>axis</strong>, <strong>point</strong>, and <strong>rotation</strong> parameters. The <strong>axis</strong>
parameter is a vector defining the axis of rotation. The <strong>point</strong> parameter is a vector
defining a point about which to rotate. The <strong>rotation</strong> parameter is a scalar defining
the rotation angle in radians. An example of these new boundary condition options is
shown below.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>boundaryStates: <periodic(startTag=4; endTag=5; translation=[0.01, 0, 0]),
viscousWall(tag=1; temperature=288),
viscousWall(tag=2; temperature=289; velocity=[75.4, 0, 0])>
</code></pre>
</div>
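<p>To make the two periodic transformations concrete, here is a minimal Python sketch of how a point on the <strong>startTag</strong> boundary could be mapped to the <strong>endTag</strong> boundary. It is not Aither's implementation: the function names are hypothetical, and the rotation uses Rodrigues' formula as an assumed way of applying a rotation given an axis, a point on that axis, and an angle in radians.</p>

```python
import math


def translate(point, translation):
    """Shift a point by the translation vector."""
    return [p + t for p, t in zip(point, translation)]


def rotate(point, axis, origin, angle):
    """Rotate a point by angle (radians) about an axis through origin."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                 # unit vector along the axis
    r = [p - o for p, o in zip(point, origin)]   # position relative to origin
    cross = [k[1] * r[2] - k[2] * r[1],
             k[2] * r[0] - k[0] * r[2],
             k[0] * r[1] - k[1] * r[0]]
    dot = sum(ki * ri for ki, ri in zip(k, r))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    rotated = [ri * cos_a + ci * sin_a + ki * dot * (1.0 - cos_a)
               for ri, ci, ki in zip(r, cross, k)]
    return [ri + o for ri, o in zip(rotated, origin)]


print(translate([0.0, 0.0005, 0.0], [0.01, 0.0, 0.0]))
print(rotate([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0], math.pi / 2.0))
```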
<h2 id="results-and-summary">Results and Summary</h2>
<p>The results from Aither show a linear velocity profile and a parabolic temperature
profile as expected. The results agree very well with the exact solution. These results
can be reproduced by running the <a href="https://github.com/mnucci32/aither">Aither</a> code. The
grid and input file for the Couette flow case can be found in the <strong>testCases</strong> directory
of the repository.</p>
<p><img src="/downloads/couette.png" alt="Couette" class="center-image" /></p>
<center>Velocity and temperature profiles for Couette flow.</center>
<h2 id="references">References</h2>
<p><a href="https://www.amazon.com/Numerical-Computation-Internal-External-Flows/dp/0750665947">[1]</a>
Hirsch, Charles. “Numerical Computation of Internal and External Flows”. 2nd Edition.
Butterworth-Heinemann. 2006.</p>
Sat, 04 Mar 2017 17:00:00 +0000
http://mnucci32.github.io/aither/2017/03/04/couette-flow.html
CFD, Aither, C++, couette, periodic, isothermal
Sod's Shock Tube
<h2 id="sods-shock-tube">Sod’s Shock Tube</h2>
<p>Sod’s shock tube [1] is a 1D canonical problem used to test the accuracy of CFD codes. The problem
consists of a fluid in a tube divided by a diaphragm. The fluid on the left side of the diaphragm
is at a high pressure, and the fluid on the right side of the diaphragm is at a lower pressure.
At time <em>t = 0</em>, the diaphragm is punctured and the fluid is allowed to mix. This results in a
right moving shock wave and contact discontinuity, and a left moving expansion wave. Numerically
this can be simulated by solving the Euler equations. The exact solution can be determined
analytically and used to compare to the CFD simulation result. Anderson’s book [2] describes
the process for computing the analytical solution.</p>
<p>Typically the flow variables are normalized by the high pressure state, so that the initial
conditions of the simulation are as shown.</p>
<script type="math/tex; mode=display">Q_l = \left[ \begin{array}{c}
\rho_l \\
v_{x_l} \\
v_{y_l} \\
v_{z_l} \\
P_l \\
\end{array} \right]
=
\left[ \begin{array}{c}
1.0 \\
0.0 \\
0.0 \\
0.0 \\
1.0 \\
\end{array} \right]
;
Q_r = \left[ \begin{array}{c}
\rho_r \\
v_{x_r} \\
v_{y_r} \\
v_{z_r} \\
P_r \\
\end{array} \right]
=
\left[ \begin{array}{c}
0.125 \\
0.0 \\
0.0 \\
0.0 \\
0.1 \\
\end{array} \right]</script>
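<p>As a small illustration, the initial conditions above can be laid out on a one-dimensional grid in Python. The unit-length tube and the diaphragm position of 0.5 are assumptions made for this sketch, and the function name is hypothetical.</p>

```python
def sod_initial_state(x, diaphragm=0.5):
    """Return [rho, vx, vy, vz, p] at position x for Sod's problem."""
    left = [1.0, 0.0, 0.0, 0.0, 1.0]
    right = [0.125, 0.0, 0.0, 0.0, 0.1]
    return left if x < diaphragm else right


# Cell centers of a coarse grid on an assumed tube spanning 0 <= x <= 1.
num_cells = 8
centers = [(i + 0.5) / num_cells for i in range(num_cells)]
states = [sod_initial_state(x) for x in centers]
for x, q in zip(centers, states):
    print(x, q)
```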
<h2 id="reconstruction-schemes">Reconstruction Schemes</h2>
<p>In the cell-centered finite volume method, the volume averaged flow variables are stored at the
centroid of the cell. To calculate the fluxes at the cell faces, the flow variables are needed
at the cell faces. The solution therefore must be reconstructed from the cell centroids to the
cell faces. How this is done can greatly affect the accuracy of the simulation. Sod’s shock
tube problem was run using three of the reconstruction methods available in Aither: <em>constant</em>,
<em>thirdOrder</em> (MUSCL), and <em>weno</em>.</p>
<h4 id="constant-reconstruction">Constant Reconstruction</h4>
<p>Constant reconstruction is a zeroth order reconstruction that results in a first order accurate
simulation. In this method the flow variables at the cell face are set equal to the flow
variables at the adjacent cell center. While very robust, this method is quite dissipative. It
typically results in a solution that is not accurate enough for engineering purposes, as
important flow features such as shocks are smeared.</p>
<h4 id="muscl-reconstruction">MUSCL Reconstruction</h4>
<p>MUSCL schemes were originally developed by van Leer. The stencil for this family of schemes uses
the flow variables at two cell centers upwind, and one downwind of the cell face. The MUSCL
schemes vary the weights of the flow variables in the stencil via a parameter <script type="math/tex">\kappa</script>. For
most values of <script type="math/tex">\kappa</script> the scheme results in a piecewise linear reconstruction which in turn
results in a second order accurate simulation. When <script type="math/tex">\kappa</script> is set equal to one third,
the scheme results in a piecewise parabolic reconstruction that is third order accurate. However,
due to the assumption that the flux is constant over the cell face, the simulation is still
second order accurate. Even so, the solution error for a simulation with <script type="math/tex">\kappa</script> equal to one third will
typically be lower than for other values of <script type="math/tex">\kappa</script>.</p>
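<p>The effect of <script type="math/tex">\kappa</script> can be seen with a short Python sketch. It uses the commonly quoted form of the <script type="math/tex">\kappa</script> scheme on a uniform grid, which is an assumption here and may differ in detail from what Aither implements; the function and argument names are hypothetical.</p>

```python
def muscl_left_state(u_upwind2, u_upwind1, u_downwind1, kappa=1.0 / 3.0):
    """Unlimited kappa-scheme value on the upwind side of a cell face.

    u_upwind2 and u_upwind1 are the two cells upwind of the face (farthest
    first) and u_downwind1 is the cell downwind of it.
    """
    d_minus = u_upwind1 - u_upwind2
    d_plus = u_downwind1 - u_upwind1
    return u_upwind1 + 0.25 * ((1.0 - kappa) * d_minus + (1.0 + kappa) * d_plus)


# Cell averages of f(x) = x^2 on unit-width cells centered at x = 0, 1, 2.
averages = [i * i + 1.0 / 12.0 for i in range(3)]
third = muscl_left_state(*averages)               # kappa = 1/3
upwind = muscl_left_state(*averages, kappa=-1.0)  # fully upwind variant
print(third, upwind)  # the exact face value at x = 1.5 is 2.25
```

<p>For this parabolic data the one-third value reproduces the face value to rounding error, while the fully upwind choice does not, which is consistent with the piecewise parabolic behavior described above.</p>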
<p>The MUSCL scheme by itself, as with all higher order accurate schemes, suffers from spurious
oscillations around discontinuities. In practice the reconstruction is limited through the use of
a slope limiter. This means that near discontinuities the order of accuracy of the reconstruction
is dropped to avoid the spurious oscillations.</p>
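<p>A minimal sketch of the idea, using the minmod limiter on a piecewise linear reconstruction, is shown below in Python. The choice of minmod is an assumption made for illustration and is not necessarily one of the limiters available in Aither; the function names are hypothetical.</p>

```python
def minmod(a, b):
    """Return the smaller-magnitude argument, or zero if the signs differ."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def limited_left_state(u_upwind2, u_upwind1, u_downwind1):
    """Piecewise linear face value with a minmod-limited slope."""
    slope = minmod(u_upwind1 - u_upwind2, u_downwind1 - u_upwind1)
    return u_upwind1 + 0.5 * slope


def unlimited_left_state(u_upwind2, u_upwind1, u_downwind1):
    """Unlimited kappa = 1/3 face value, for comparison."""
    d_minus = u_upwind1 - u_upwind2
    d_plus = u_downwind1 - u_upwind1
    return u_upwind1 + (d_minus + 2.0 * d_plus) / 6.0


# Face just downstream of a step from 0 to 1.
step = (0.0, 1.0, 1.0)
print(unlimited_left_state(*step), limited_left_state(*step))
```

<p>At the step the unlimited value overshoots the data, while the limited value falls back to the cell value, which is the constant reconstruction. In smooth, monotone data the limiter leaves a nonzero slope in place.</p>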
<h4 id="weno-reconstruction">WENO Reconstruction</h4>
<p>Weighted essentially non-oscillatory (WENO) schemes [3] were originally developed by Shu. The
stencil for this family of schemes uses the flow variables at three cell centers upwind, and two
downwind of the cell face. The WENO scheme uses the piecewise parabolic reconstruction of the
MUSCL scheme with <script type="math/tex">\kappa</script> equal to one third over three candidate substencils. The first of the
three substencils consists of the three upwind cells. The second consists of two upwind cells
and one downwind cell. The third substencil consists of one upwind cell and two downwind cells.
These three substencils are then weighted and combined to produce a fifth order accurate
reconstruction in smooth regions of the flow. In areas near discontinuities, the substencils
containing discontinuities are weighted to not contribute to the reconstruction which drops the
order of accuracy. Even though the reconstruction can be fifth order accurate, the simulation will
still be limited to second order accuracy due to the assumption of a constant flux on the cell
face.</p>
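<p>The weighting of the three substencils can be sketched in Python as follows. The sketch uses the classic smoothness indicators and nonlinear weights of Jiang and Shu on a uniform grid, which is an assumption: Aither's own weights may differ, and the function name, argument names, and the value of the small parameter <code>eps</code> are hypothetical.</p>

```python
def weno5_left_state(um2, um1, u0, up1, up2, eps=1.0e-6):
    """Fifth-order WENO value on the upwind side of the face between u0 and up1.

    um2, um1, u0 are the three upwind cells; up1, up2 are the two downwind.
    """
    # Third-order candidate reconstructions on the three substencils.
    q0 = (2.0 * um2 - 7.0 * um1 + 11.0 * u0) / 6.0
    q1 = (-um1 + 5.0 * u0 + 2.0 * up1) / 6.0
    q2 = (2.0 * u0 + 5.0 * up1 - up2) / 6.0

    # Smoothness indicators: large where a substencil crosses a discontinuity.
    b0 = (13.0 / 12.0 * (um2 - 2.0 * um1 + u0) ** 2
          + 0.25 * (um2 - 4.0 * um1 + 3.0 * u0) ** 2)
    b1 = (13.0 / 12.0 * (um1 - 2.0 * u0 + up1) ** 2
          + 0.25 * (um1 - up1) ** 2)
    b2 = (13.0 / 12.0 * (u0 - 2.0 * up1 + up2) ** 2
          + 0.25 * (3.0 * u0 - 4.0 * up1 + up2) ** 2)

    # Nonlinear weights built from the ideal weights 1/10, 6/10, 3/10.
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    return (a0 * q0 + a1 * q1 + a2 * q2) / (a0 + a1 + a2)


print(weno5_left_state(0.0, 1.0, 2.0, 3.0, 4.0))  # smooth (linear) data
print(weno5_left_state(0.0, 0.0, 0.0, 1.0, 1.0))  # step just downwind of the face
```

<p>For the linear data all three substencils are equally smooth and the ideal weights are recovered. For the step, the two substencils that cross the jump receive almost no weight, so the result stays very close to the upwind value and does not overshoot.</p>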
<h2 id="results">Results</h2>
<p>The results at nondimensionalized time <em>t = 0.1</em> are shown below. Near the discontinuities, the
excessive dissipation of the constant reconstruction can be seen. As expected, the MUSCL and
WENO schemes do much better.</p>
<p><img src="/downloads/sod.png" alt="Sod" class="center-image" /></p>
<center>Shock tube results for constant, MUSCL, and WENO reconstructions.</center>
<p>It is tough to tell the difference between the MUSCL and WENO results, so a zoomed in view of the
normalized density is shown below. In the picture below it can be seen that the WENO scheme does
slightly better in that it is a bit sharper near the discontinuities. This is due to its higher
order accuracy in the reconstruction.</p>
<p><img src="/downloads/sod_zoom.png" alt="Sod_Zoom" class="center-image" /></p>
<center>Detail view of density showing expansion, contact, and shock waves.</center>
<h2 id="summary">Summary</h2>
<p>The WENO scheme provides the most accurate simulation of Sod’s shock tube problem. The constant
reconstruction method provides the most dissipative solution. These results can be reproduced by
running the <a href="https://github.com/mnucci32/aither">Aither</a> code. The grid and input file for
the shock tube case can be found in the <strong>testCases</strong> directory of the repository. The Python
script used to compare the results to the exact solution can be found
<a href="https://github.com/mnucci32/SodShockTube">here</a>.</p>
<h2 id="references">References</h2>
<p>[1] Sod, G. A. “A Survey of Several Finite Difference Methods for Systems of Nonlinear Hyperbolic
Conservation Laws”, Journal of Computational Physics, Vol 27, pp 1-31. 1978.</p>
<p><a href="https://www.amazon.com/Modern-Compressible-Flow-Historical-Perspective/dp/0072424435">[2]</a>
Anderson, J. “Modern Compressible Flow with Historical Perspective”. McGraw-Hill Education, 2002.</p>
<p><a href="https://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19980007543.pdf">[3]</a> Shu, C. “Essentially
Non-Oscillatory and Weighted Essentially Non-Oscillatory Schemes for Hyperbolic Conservation
Laws”. NASA CR-97-206253. ICASE Report No. 97-65. 1997.</p>
Sun, 29 Jan 2017 17:00:00 +0000
http://mnucci32.github.io/aither/2017/01/29/sod-shock-tube.html
CFD, Aither, C++, shock tube, sod, weno, weno-z, muscl
New Input File Syntax: Vectors, States, & Lists
<h2 id="new-input-file-syntax">New Input File Syntax</h2>
<p>There is a new input file syntax for Aither now in use in the <strong>develop</strong> branch of the code. This
syntax makes it easy to specify initial conditions by grid block, and boundary conditions by
boundary condition tag. This is a huge upgrade in usability as it now allows for problems such as
Sod’s shock tube to be simulated. It also allows for easy implementation of various <strong>viscousWall</strong>
boundary conditions such as <em>adiabatic</em>, <em>isothermal</em>, and <em>constant heat flux</em>. The new input file
syntax is based off of three new objects (<strong>vectors</strong>, <strong>lists</strong>, & <strong>states</strong>) which will be
discussed in detail below.</p>
<h3 id="vectors">Vectors</h3>
<p>Vectors are now input in a comma separated list enclosed in brackets like below. Vectors must be
defined entirely on one line in the input file.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>velocityRef: [1.0, 0.0, 0.0]
</code></pre>
</div>
<p>Valid vector inputs have three components. If three components are not specified, Aither will throw
an error. Vector inputs are now used wherever vector quantities are needed such as for velocity
(above), or specifying a direction as is done with the <strong>stagnationInlet</strong> boundary condition.</p>
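<p>The vector rules above can be illustrated with a small Python sketch. This is not Aither's parser, which is written in C++; the function name and error messages are hypothetical and only mirror the behavior described here.</p>

```python
def parse_vector(text):
    """Parse '[x, y, z]' into a list of three floats."""
    stripped = text.strip()
    if not (stripped.startswith("[") and stripped.endswith("]")):
        raise ValueError("vector must be enclosed in brackets: " + text)
    components = [float(item) for item in stripped[1:-1].split(",")]
    if len(components) != 3:
        raise ValueError("vector must have three components: " + text)
    return components


print(parse_vector("[1.0, 0.0, 0.0]"))

try:
    parse_vector("[1.0, 0.0]")
except ValueError as err:
    print("rejected:", err)
```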
<h3 id="states">States</h3>
<p>States are a group of properties that apply to an initial condition state or a boundary condition
state. States are identified by name, enclosed in parentheses, and individual properties within a
state are assigned with the equals operator and separated by semicolons. States must be defined
entirely on one line in the input file. Below is an example of <em>icState</em> which is used to specify
a flow state for initial conditions.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>icState(tag=0; pressure=101325; density=1.225; velocity=[100, 0, 0])
</code></pre>
</div>
<p>The supported properties depend on the type of state. There are optional turbulence properties
<em>turbulenceIntensity</em> and <em>eddyViscosityRatio</em> that may be specified for states used for inflow
boundary conditions or <em>icState</em>. If either of the optional turbulence properties is specified,
both must be specified. For <em>icState</em> the <em>tag</em> property is special. It refers to the block
number in which the <em>icState</em> will be applied. A value of -1 functions as the default state in
the event that there is not an <em>icState</em> with a tag pointing to a given block. An explicitly
specified tag takes precedence over the default state. For example, for a four block grid with
two <em>icState</em>s defined, one with a tag of -1, and another with a tag of 0, blocks 1-3 will use
the default <em>icState</em> with tag -1, and block 0 will use the <em>icState</em> with the tag of 0.</p>
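<p>The tag lookup described above can be sketched in Python. The dictionaries stand in for parsed <em>icState</em> entries and the function name is hypothetical; this only illustrates the precedence rule, not how Aither stores its states.</p>

```python
def state_for_block(block, states):
    """Return the icState for a block, falling back to the tag -1 default."""
    by_tag = {state["tag"]: state for state in states}
    if block in by_tag:
        return by_tag[block]
    if -1 in by_tag:
        return by_tag[-1]
    raise KeyError("no icState for block {} and no default".format(block))


# Two states for a four block grid: a default and one specific to block 0.
states = [
    {"tag": -1, "pressure": 101325.0, "density": 1.225},
    {"tag": 0, "pressure": 10132.5, "density": 0.153125},
]
assigned = [state_for_block(block, states)["tag"] for block in range(4)]
print(assigned)  # [0, -1, -1, -1]
```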
<p>In addition to initial conditions, states are used for boundary conditions that may require
additional information. An example of each such boundary condition is shown below. For boundary
conditions, the tag property in each state refers to the boundary surface tag that is specified
in the boundary condition definition.</p>
<p>Inflow boundary conditions. These may optionally specify the turbulence properties.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>characteristic(tag=0; pressure=101325; density=1.225; velocity=[100, 0, 0])
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>stagnationInlet(tag=0; p0=101325; t0=300; direction=[1, 0, 0])
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>supersonicInflow(tag=0; pressure=101325; density=1.225; velocity=[100, 0, 0])
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>subsonicInflow(tag=0; density=1.225; velocity=[100, 0, 0])
</code></pre>
</div>
<p>Outflow boundary conditions.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>pressureOutlet(tag=0; pressure=101325)
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>subsonicOutflow(tag=0; pressure=101325)
</code></pre>
</div>
<p>Wall boundary conditions. One of <em>heatFlux</em> or <em>temperature</em> may be specified. The default behavior is zero
velocity and zero heat flux which corresponds to a stationary adiabatic wall.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>viscousWall(tag=0; heatFlux=100)
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>viscousWall(tag=0; temperature=400)
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>viscousWall(tag=0; velocity=[10, 0, 0])
</code></pre>
</div>
<h3 id="lists">Lists</h3>
<p>Lists are a comma separated group of properties that are enclosed in angle brackets. Lists may be specified
across multiple lines. Lists are most commonly used to specify the variables to output, the initial condition
states, and the boundary condition states. Examples are shown below.</p>
<div class="highlighter-rouge"><pre class="highlight"><code>outputVariables: <density, vel_x, vel_y, vel_z, pressure, temperature, mach>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>initialConditions: <icState(tag=0; pressure=101325; density=1.225; velocity=[0, 0, 0]),
icState(tag=1; pressure=10132.5; density=0.153125; velocity=[0, 0, 0])>
</code></pre>
</div>
<div class="highlighter-rouge"><pre class="highlight"><code>boundaryStates: <characteristic(tag=0; pressure=101325; density=1.225; velocity=[100, 0, 0]),
viscousWall(tag=1; velocity=[10, 0, 0])>
</code></pre>
</div>
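<p>To make the structure of these entries concrete, the following Python sketch splits a single state such as <code class="highlighter-rouge">icState(...)</code> into its name and a dictionary of properties. It only illustrates the syntax described above. The helper name <code class="highlighter-rouge">parse_state</code> is hypothetical, and this is not how Aither's own C++ input parser is implemented.</p>

```python
import re


def parse_state(text):
    """Split 'name(key=value; key=[a, b, c])' into (name, properties).

    Hypothetical helper for illustration only; it is not part of Aither.
    """
    match = re.fullmatch(r"\s*(\w+)\((.*)\)\s*", text, flags=re.DOTALL)
    if match is None:
        raise ValueError(f"not a state: {text!r}")
    name, body = match.groups()
    properties = {}
    # Properties are separated by semicolons, vector components by commas.
    for item in body.split(";"):
        key, value = (part.strip() for part in item.split("=", 1))
        if value.startswith("[") and value.endswith("]"):
            properties[key] = [float(v) for v in value[1:-1].split(",")]
        else:
            properties[key] = float(value)
    return name, properties


name, props = parse_state(
    "icState(tag=0; pressure=101325; density=1.225; velocity=[0, 0, 0])"
)
print(name, props)
```

<p>Every value is stored as a float here to keep the sketch short. A real parser would likely keep <em>tag</em> as an integer and validate the property names.</p>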
<h2 id="summary">Summary</h2>
<p>The new input syntax in Aither is more intuitive and now allows for a wider variety of problems to
easily be simulated. All of the test cases in the <strong>develop</strong> branch have been updated to support this
new syntax. Grab the <strong>develop</strong> branch from Github and try it out today. This will be merging into the
<strong>master</strong> branch shortly.</p>
Mon, 26 Dec 2016 02:00:00 +0000
http://mnucci32.github.io/aither/2016/12/26/new-input-syntax.html
CFD, Aither, C++, develop, vectors, states, lists, input, syntax
Using Travis CI For Regression Tests
<h2 id="why-use-a-continuous-integration-service">Why Use A Continuous Integration Service?</h2>
<p>Continuous integration services allow code updates to be built and tested on a variety of platforms. This saves the developer
a lot of time by not having to manually test the code. As Aither grows larger it becomes more and more beneficial to use a
continuous integration service. For example, say a more efficient way to calculate the inviscid flux was found, and a new
branch was created to refactor the inviscid flux code to use this new method. This change should result in the same solution,
but should take less time to complete. To be thorough, before merging the code back into the <strong>develop</strong> branch, unit tests
covering all of the code’s various functionality should be completed. These tests should still show that the solution is the
same as it was prior to the refactor. It can be tedious and time consuming to manually run these tests, not to mention
the tests should be run on different operating systems, and with different compilers as well. This is where continuous
integration saves the day! A continuous integration service will automatically build the most updated code on a variety of
operating systems with a variety of compilers, and can be made to run regression tests. This way it can easily be determined
if the refactor introduced any bugs.</p>
<h2 id="aithers-requirements-for-continuous-integration">Aither’s Requirements For Continuous Integration</h2>
<p>Ok, so it is clear that continuous integration is a good thing, but which service should be used? Ideally, a continuous integration
service would provide the following:</p>
<ul>
<li>Free (Aither is not a money making venture after all)</li>
<li>Testing on multiple operating systems (Aither is cross platform)</li>
<li>Testing with multiple compilers</li>
<li>Ability to use modern C++ (Aither uses C++14)</li>
<li>Support for required dependencies (Aither requires an MPI implementation and CMake)</li>
<li>Ability to run regression tests in parallel</li>
<li>Easy to use within <a href="https://github.com">Github</a></li>
</ul>
<p>After a brief survey of available options, Aither recently started using <a href="https://travis-ci.org">Travis CI</a> for continuous
integration. Travis CI meets all of the above requirements. It is free for open source projects, widely used in the Github
community (e.g. <a href="https://github.com/su2code/SU2">SU2</a>), and supports builds on Ubuntu and macOS.</p>
<h2 id="using-travis-ci">Using Travis CI</h2>
<p>Once an account has been created with Travis CI it is easy to integrate with Github. All that is required is to add a <strong>.travis.yml</strong> file
to the repository. This file instructs Travis CI on how to build the code and run any regression tests. For Aither, a matrix of five
builds is set up (Ubuntu/gcc-5, Ubuntu/gcc-6, Ubuntu/clang, macOS/gcc-6, macOS/clang). These builds are set up under the <code class="highlighter-rouge">matrix</code> data
field of the <strong>.travis.yml</strong> file. An abbreviated build matrix is shown below; each of the builds is marked by the <code class="highlighter-rouge">- os:</code> line.</p>
<div class="language-yaml highlighter-rouge"><pre class="highlight"><code><span class="c1"># set up build matrix</span>
<span class="s">matrix</span><span class="pi">:</span>
  <span class="s">include</span><span class="pi">:</span>
    <span class="c1"># build for Ubuntu/gcc-6</span>
    <span class="pi">-</span> <span class="s">os</span><span class="pi">:</span> <span class="s">linux</span>
      <span class="s">dist</span><span class="pi">:</span> <span class="s">trusty</span>
      <span class="s">sudo</span><span class="pi">:</span> <span class="s">required</span>
      <span class="s">compiler</span><span class="pi">:</span> <span class="s">gcc</span>
      <span class="c1"># add toolchains for newer, C++14 supporting gcc-6</span>
      <span class="s">addons</span><span class="pi">:</span>
        <span class="s">apt</span><span class="pi">:</span>
          <span class="s">sources</span><span class="pi">:</span>
            <span class="pi">-</span> <span class="s">ubuntu-toolchain-r-test</span>
          <span class="s">packages</span><span class="pi">:</span>
            <span class="pi">-</span> <span class="s">g++-6 gcc-6 libstdc++-6-dev</span>
      <span class="c1"># change default compiler to newer gcc-6</span>
      <span class="s">env</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">CXX_COMPILER=g++-6</span>
        <span class="pi">-</span> <span class="s">C_COMPILER=gcc-6</span>
    <span class="c1"># build for macOS/clang</span>
    <span class="pi">-</span> <span class="s">os</span><span class="pi">:</span> <span class="s">osx</span>
      <span class="s">osx_image</span><span class="pi">:</span> <span class="s">xcode8</span>
      <span class="s">compiler</span><span class="pi">:</span> <span class="s">clang</span>
      <span class="c1"># change default and homebrew compilers to clang</span>
      <span class="s">env</span><span class="pi">:</span>
        <span class="pi">-</span> <span class="s">CXX_COMPILER=clang++</span>
        <span class="pi">-</span> <span class="s">C_COMPILER=clang</span>
        <span class="pi">-</span> <span class="s">HOMEBREW_CC=clang</span>
        <span class="pi">-</span> <span class="s">HOMEBREW_CXX=clang++</span>
</code></pre>
</div>
<h3 id="installing-mpi">Installing MPI</h3>
<p>During the build matrix setup, environment variables for the C/C++ compilers are changed to reflect the newer C++14 supporting
compiler to be used in the build. Travis CI only offers Ubuntu 14.04 as its newest Linux offering. Since this version of the
operating system is a few years old, updated compilers are needed for the latest C++ standard. However, when the Ubuntu
package manager is used, it installs binaries that were created with the system C/C++ compilers. For compatibility purposes,
it would be best if Aither used a version of MPI that was compiled with the same compiler that will be used to compile
Aither itself. For this reason OpenMPI is compiled from source using the updated compilers.</p>
<p>For macOS, things are a little different. The macOS virtual machines from Travis CI come preinstalled with the
<a href="http://brew.sh">homebrew</a> package manager. With homebrew the <code class="highlighter-rouge">HOMEBREW_CC</code> and <code class="highlighter-rouge">HOMEBREW_CXX</code> environment variables
control the compiler that new packages are built with. This means that installing MPI is easier because the package
manager can do it automatically.</p>
<p>This means that the <strong>.travis.yml</strong> script has to tell Travis CI to install MPI in a different way depending on which
operating system the build is happening on. This can easily be done with a simple bash script as shown below.</p>
<div class="language-bash highlighter-rouge"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="c"># for macOS builds use OpenMPI from homebrew</span>
<span class="k">if</span> <span class="o">[</span> <span class="s2">"</span><span class="nv">$TRAVIS_OS_NAME</span><span class="s2">"</span> <span class="o">==</span> <span class="s2">"osx"</span> <span class="o">]</span>; <span class="k">then
</span><span class="nb">cd </span>openmpi
<span class="c"># check to see if OpenMPI is cached from previous build</span>
<span class="k">if</span> <span class="o">[</span> -f <span class="s2">"bin/mpirun"</span> <span class="o">]</span>; <span class="k">then
</span><span class="nb">echo</span> <span class="s2">"Using cached OpenMPI"</span>
<span class="k">else
</span><span class="nb">echo</span> <span class="s2">"Installing OpenMPI with homebrew"</span>
<span class="nv">HOMEBREW_TEMP</span><span class="o">=</span><span class="nv">$TRAVIS_BUILD_DIR</span>/openmpi
brew install open-mpi
<span class="k">fi
else</span>
<span class="c"># for Ubuntu builds install OpenMPI from source</span>
<span class="c"># check to see if OpenMPI is cached from previous build</span>
<span class="k">if</span> <span class="o">[</span> -f <span class="s2">"openmpi/bin/mpirun"</span> <span class="o">]</span> <span class="o">&&</span> <span class="o">[</span> -f <span class="s2">"openmpi-2.0.1/config.log"</span> <span class="o">]</span>; <span class="k">then
</span><span class="nb">echo</span> <span class="s2">"Using cached OpenMPI"</span>
<span class="nb">echo</span> <span class="s2">"Configuring OpenMPI"</span>
<span class="nb">cd </span>openmpi-2.0.1
./configure --prefix<span class="o">=</span><span class="nv">$TRAVIS_BUILD_DIR</span>/openmpi <span class="nv">CC</span><span class="o">=</span><span class="nv">$C_COMPILER</span> <span class="nv">CXX</span><span class="o">=</span><span class="nv">$CXX_COMPILER</span> &> openmpi.configure
<span class="k">else</span>
<span class="c"># install OpenMPI from source</span>
<span class="nb">echo</span> <span class="s2">"Downloading OpenMPI Source"</span>
wget https://www.open-mpi.org/software/ompi/v2.0/downloads/openmpi-2.0.1.tar.gz
tar zxf openmpi-2.0.1.tar.gz
<span class="nb">echo</span> <span class="s2">"Configuring and building OpenMPI"</span>
<span class="nb">cd </span>openmpi-2.0.1
./configure --prefix<span class="o">=</span><span class="nv">$TRAVIS_BUILD_DIR</span>/openmpi <span class="nv">CC</span><span class="o">=</span><span class="nv">$C_COMPILER</span> <span class="nv">CXX</span><span class="o">=</span><span class="nv">$CXX_COMPILER</span> &> openmpi.configure
make -j4 &> openmpi.make
make install &> openmpi.install
<span class="nb">cd</span> ..
<span class="k">fi</span>
<span class="c"># recommended by Travis CI documentation to unset these for MPI builds</span>
<span class="nb">test</span> -n <span class="nv">$CC</span> <span class="o">&&</span> <span class="nb">unset </span>CC
<span class="nb">test</span> -n <span class="nv">$CXX</span> <span class="o">&&</span> <span class="nb">unset </span>CXX
<span class="k">fi</span>
</code></pre>
</div>
<h3 id="caching-dependencies">Caching Dependencies</h3>
<p>Builds can be sped up on Travis CI by caching dependencies. Aither depends on MPI which can take a while to build from
source. However, this really only needs to be done once if the MPI installation can be cached and retrieved from build
to build. Fortunately, Travis CI allows this capability even for their free tier of services. Caching the MPI install
directory is simple. Only the following few lines need to be added to the <strong>.travis.yml</strong> file. This caches the OpenMPI
source code directory, as well as the installation directory.</p>
<div class="language-yaml highlighter-rouge"><pre class="highlight"><code><span class="s">cache</span><span class="pi">:</span>
<span class="s">directories</span><span class="pi">:</span>
<span class="pi">-</span> <span class="s">openmpi</span>
<span class="pi">-</span> <span class="s">openmpi-2.0.1</span>
</code></pre>
</div>
<h2 id="regression-tests">Regression Tests</h2>
<p>Once the build completes Travis CI will run the Aither regression tests. The regression tests are located in the <strong>testCases</strong>
directory of the repository. Travis CI will run each case for 100 iterations and compare the residuals to some “truth” values.
If the residuals differ by less than a given amount (1% for Aither), the test passes. The idea is that the regression tests
cover most or all of the code’s functionality. On Ubuntu builds there are two processors available, so the tests are run in
parallel. On macOS there is only one processor available, so the tests are run in serial. The Aither repository includes a
python script to automate the running of these regression tests. After Travis CI builds the code, the script is invoked to
run the tests.</p>
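<p>The pass/fail criterion described above can be sketched in a few lines of Python. This is a minimal illustration assuming a relative tolerance of 1% applied to each residual. The function name <code class="highlighter-rouge">residuals_match</code> and the residual values are made up for the example, and this is not the script that ships in the Aither repository.</p>

```python
def residuals_match(computed, truth, tolerance=0.01):
    """Return True if every residual is within `tolerance` (relative) of truth."""
    if len(computed) != len(truth):
        return False
    for value, reference in zip(computed, truth):
        # Relative difference with respect to the stored "truth" value.
        if abs(value - reference) > tolerance * abs(reference):
            return False
    return True


# Made-up residuals purely for illustration.
truth = [1.0e-3, 2.5e-4, 7.0e-5]
print(residuals_match([1.005e-3, 2.49e-4, 7.02e-5], truth))  # all within 1%
print(residuals_match([1.05e-3, 2.5e-4, 7.0e-5], truth))     # first is 5% off
```

<p>A relative tolerance is used because residuals for different equations can differ by orders of magnitude, so a single absolute threshold would be too loose for some and too tight for others.</p>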
<h2 id="conclusion">Conclusion</h2>
<p>Travis CI is now used by Aither to test builds on Ubuntu and macOS using gcc-5, gcc-6, and clang. Regression tests are run for
all of Aither’s test cases to ensure that no existing functionality is broken with changes to the code. For more information on
how the whole thing is set up, visit the <a href="https://github.com/mnucci32/aither">repository</a> and check out the <strong>.travis.yml</strong> and
<strong>travis/installMPI</strong> files.</p>
Sat, 03 Dec 2016 12:00:00 +0000
http://mnucci32.github.io/aither/2016/12/03/using-travisci.html
CFD, Aither, C++, v0.4.0, travis, continuous integration, travisci, regression, C++11, C++14, Cmake
Evaluation of Eigen
<h2 id="should-aither-use-eigen">Should Aither Use Eigen?</h2>
<p>Aither, like many CFD codes, requires matrix and vector operations when calculating the flow solution. When using block implicit
methods, Aither uses 5x5 matrices for the flow equations and 2x2 matrices if a turbulence model is selected. These matrices
are multiplied with other matrices, multiplied with vectors, scaled by scalar values, added, subtracted, and inverted. If these
operations could be performed more efficiently, it would result in a great performance improvement. There are many third party
linear algebra libraries available that could be used by Aither such as <a href="http://eigen.tuxfamily.org/index.php?title=Main_Page">Eigen</a>,
<a href="http://arma.sourceforge.net/">Armadillo</a>, and <a href="https://www.mcs.anl.gov/petsc/">PETSc</a>. PETSc is widely used and has the ability
to run in parallel. It contains many robust matrix solvers as well. However, it is written in C, must be linked to, and is probably
overkill for small matrix/vector operations as described above. Eigen is also widely used and written in C++. It has the added
advantage of being entirely header-only, so there is no need to link to anything. It could therefore be distributed with the Aither
source code, eliminating the issue of finding a dependency on various computer systems. Armadillo is another linear algebra library
written in C++. It links with libraries such as LAPACK, OpenBLAS, MKL, or ATLAS. Since Eigen claims comparable or better performance
than many linear algebra libraries, and it has the best ease of use, it was chosen for evaluation. Armadillo may be evaluated at
a later point.</p>
<h2 id="test-cases">Test Cases</h2>
<p>The following five test cases were used to evaluate Eigen versus the linear algebra code already in Aither.</p>
<ul>
<li>Matrix-matrix multiplication <script type="math/tex">\left( A B = C \right)</script></li>
<li>Matrix-vector multiplication <script type="math/tex">\left( A \vec{x} = \vec{b} \right)</script></li>
<li>Matrix multiplication with a scalar and addition <script type="math/tex">\left( A s + B = C \right)</script></li>
<li>Vector multiplication with a scalar and addition <script type="math/tex">\left( \vec{x} s + \vec{y} = \vec{z} \right)</script></li>
<li>Matrix inverse <script type="math/tex">\left( A^{-1} = B \right)</script></li>
</ul>
<p>The above tests were repeated 10 million times each using 5x5 and 2x2 matrices. Eigen has predefined classes for small (up to size 4)
matrices and vectors which statically allocate their memory on the stack. These can be quite a bit faster than the more general n-dimensional
matrices and vectors which dynamically allocate their memory on the heap. Aither requires general sized matrices because the matrix size is
determined at run time. For scalar implicit methods like LU-SGS and DPLUR the matrix size is 1. For block implicit methods like
BLU-SGS and BDPLUR the matrix size is 5. For the tests using the 2x2 matrices both the static <code class="highlighter-rouge">Eigen::Matrix2d</code> and dynamic <code class="highlighter-rouge">Eigen::MatrixXd</code>
versions were used, but the comparison to Aither was made using the dynamically allocated (heap) version. Aither uses dynamic allocation for all matrices
(<code class="highlighter-rouge">squareMatrix(5)</code>, <code class="highlighter-rouge">squareMatrix(2)</code>), but uses static allocation for vectors (<code class="highlighter-rouge">genArray</code>). In Aither, all vectors are of size 7, which is the
maximum number of equations solved.</p>
<h2 id="results">Results</h2>
<p>The timing results for each of the tests are shown below in seconds. There is no data for the vector multiplication with a scalar and addition test for Aither
for the size 2 vector because in Aither all vectors are the same length. Therefore the test would be the same as for the larger vector.</p>
<table>
<thead>
<tr>
<th>Matrix Type</th>
<th>Size</th>
<th>Matrix-Matrix Multiplication</th>
<th>Matrix-Vector Multiplication</th>
<th>Matrix Scale & Addition</th>
<th>Vector Scale & Addition</th>
<th>Matrix Inverse</th>
</tr>
</thead>
<tbody>
<tr>
<td><code class="highlighter-rouge">Eigen::MatrixXd</code></td>
<td>5x5</td>
<td>2.87540</td>
<td>1.04917</td>
<td>2.30247e-1</td>
<td>4.68587e-2</td>
<td>9.86778</td>
</tr>
<tr>
<td><code class="highlighter-rouge">Eigen::Matrix2d</code></td>
<td>2x2</td>
<td>6.05223e-3</td>
<td>3.09800e-3</td>
<td>4.20000e-8</td>
<td>4.70000e-8</td>
<td>3.15865e-3</td>
</tr>
<tr>
<td><code class="highlighter-rouge">Eigen::MatrixXd</code></td>
<td>2x2</td>
<td>1.81787</td>
<td>6.09033e-1</td>
<td>2.65002e-1</td>
<td>1.67215e-2</td>
<td>3.70098</td>
</tr>
<tr>
<td><code class="highlighter-rouge">squareMatrix</code></td>
<td>5x5</td>
<td>1.91885</td>
<td>2.39314e-1</td>
<td>1.81551</td>
<td>2.13851e-2</td>
<td>6.99836</td>
</tr>
<tr>
<td><code class="highlighter-rouge">squareMatrix</code></td>
<td>2x2</td>
<td>5.08120e-1</td>
<td>1.09769e-1</td>
<td>9.79889e-1</td>
<td>N/A</td>
<td>1.45957</td>
</tr>
</tbody>
</table>
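<p>To make the comparison in the table easier to read, the short Python sketch below divides the dynamically allocated Eigen timings by the corresponding Aither timings. The numbers are copied from the table. The vector test is left out because Aither has no 2x2 entry for it, and the variable and function names are chosen only for this example.</p>

```python
# Timings in seconds copied from the table above (10 million repetitions each).
tests = ["matrix-matrix", "matrix-vector", "matrix scale & add", "matrix inverse"]
eigen_dynamic = {
    5: [2.87540, 1.04917, 2.30247e-1, 9.86778],
    2: [1.81787, 6.09033e-1, 2.65002e-1, 3.70098],
}
aither = {
    5: [1.91885, 2.39314e-1, 1.81551, 6.99836],
    2: [5.08120e-1, 1.09769e-1, 9.79889e-1, 1.45957],
}


def speed_ratios(size):
    """Eigen time divided by Aither time; values above 1 mean Aither is faster."""
    return {
        test: eigen_dynamic[size][i] / aither[size][i]
        for i, test in enumerate(tests)
    }


for size in (5, 2):
    for test, ratio in speed_ratios(size).items():
        print(f"{size}x{size} {test}: {ratio:.2f}")
```

<p>A ratio above one means the Aither class was faster for that test, and a ratio below one means the dynamically allocated Eigen class was faster.</p>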
<h2 id="conclusions">Conclusions</h2>
<p>Surprisingly, Aither outperforms the dynamically allocated Eigen classes in all tests with the exception of the matrix multiplication with a scalar and addition test. The statically allocated
Eigen matrix and vector classes far outperform both the dynamically allocated Eigen classes and the Aither classes. This is expected as memory access is faster
to the stack than it is to the heap. Looking at the Eigen <a href="http://eigen.tuxfamily.org/index.php?title=Benchmark">benchmarks</a>, the best performance is expected
for larger matrices than the ones tested here. The results of these tests indicate that it would not be worthwhile to integrate Eigen into Aither. The source
code for these tests can be found <a href="https://github.com/mnucci32/eigenVsAither">here</a>.</p>
Sat, 15 Oct 2016 12:00:00 +0000
http://mnucci32.github.io/aither/2016/10/15/evaluation-of-eigen.html
CFD, Aither, C++, v0.4.0, Eigen, matrix, linear algebra, blas, linpack, lapack, OpenBLAS, ATLAS, intel MKL
LU-SGS versus DPLUR
<h2 id="comparison-of-implicit-methods-available-in-aither">Comparison of Implicit Methods Available In Aither</h2>
<p>Aither v0.3.0 contains four implicit methods for solving the selected system of equations: Lower-Upper Symmetric Gauss Seidel (LU-SGS)
<a href="http://aero-comlab.stanford.edu/Papers/AIAA-10007-471.pdf">[1]</a>,
Block Lower-Upper Symmetric Gauss Seidel (BLU-SGS) <a href="http://www.dept.ku.edu/~cfdku/papers/2000-AIAAJ.pdf">[2]</a>,
Data Parallel Lower Upper Relaxation (DPLUR) <a href="https://www.researchgate.net/publication/265377421_A_data-parallel_LU-SGS_method_for_reacting_flows">[3]</a>,
and Block Data Parallel Lower Upper Relaxation (BDPLUR)
<a href="https://www.researchgate.net/publication/245424386_A_data-parallel_LU_relaxation_method_for_the_Navier-Stokes_equations">[4]</a>.
These four methods are similar in nature but have different performance characteristics depending on the
problem at hand. At their core they all involve solving a system of equations. We start with the implicit discretization cast in
delta form as shown below where <script type="math/tex">\Delta X^n = X^{n+1} - X^n</script>.</p>
<script type="math/tex; mode=display">\frac{V}{\Delta t} \Delta W^n = -R^{n+1}</script>
<script type="math/tex; mode=display">R^{n+1} = R^n + \frac{\partial R^n}{\partial W^n} \Delta W^n + ...</script>
<script type="math/tex; mode=display">\left[ \frac{V}{\Delta t} + \frac{\partial R^n}{\partial W^n} \right] \Delta W^n = -R^n</script>
<p>In the above equation <script type="math/tex">\frac{\partial R^n}{\partial W^n}</script> is an <script type="math/tex">N</script> x <script type="math/tex">N</script> block matrix where each block is <script type="math/tex">n</script> x <script type="math/tex">n</script>.
<script type="math/tex">N</script> is the number of cells in the domain and <script type="math/tex">n</script> is the number of equations being solved in each cell. <script type="math/tex">\Delta W^n</script> and <script type="math/tex">-R^n</script>
are <script type="math/tex">N</script> x 1 block vectors where each block is of size <script type="math/tex">n</script> x 1. As you can see, solving the Navier-Stokes equations implicitly boils
down to solving the canonical linear algebra problem <script type="math/tex">A x = b</script>. All four of the implicit methods within Aither start from this
discretization. They differ only in the approximations made in constructing the matrix <script type="math/tex">A</script>, and the
solution method. All four methods approximately solve <script type="math/tex">A x = b</script>, where LU-SGS and BLU-SGS use the Gauss Seidel method and DPLUR and
BDPLUR use the Jacobi method. Since <script type="math/tex">A x = b</script> is being solved approximately, there is no reason to waste computational expense to
accurately calculate <script type="math/tex">A</script>. Since <script type="math/tex">A</script> is only being calculated approximately, it is split into lower triangular, diagonal, and
upper triangular matrices.</p>
<script type="math/tex; mode=display">A \approx L + D + U</script>
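<p>The difference between the two relaxation approaches can be seen on a tiny model problem. The Python sketch below applies Jacobi sweeps and symmetric Gauss Seidel sweeps to a small, diagonally dominant system using the same split of the matrix into its lower, diagonal, and upper parts. The 3 x 3 system and all function names are assumptions made for illustration; Aither's methods work on block systems and use further approximations that are not reproduced here.</p>

```python
def jacobi_sweep(A, b, x):
    """One Jacobi sweep: every unknown is updated from the previous iterate."""
    n = len(b)
    return [
        (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
        for i in range(n)
    ]


def symmetric_gauss_seidel_sweep(A, b, x):
    """One forward and one backward Gauss Seidel sweep using the newest values."""
    n = len(b)
    x = list(x)
    for order in (range(n), reversed(range(n))):
        for i in order:
            off_diagonal = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - off_diagonal) / A[i][i]
    return x


def residual_norm(A, b, x):
    """L2 norm of b - A x."""
    n = len(b)
    return sum(
        (b[i] - sum(A[i][j] * x[j] for j in range(n))) ** 2 for i in range(n)
    ) ** 0.5


# Small diagonally dominant test system chosen only for illustration.
# Its exact solution is x = [1, 2, 3].
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]

x_jacobi = [0.0, 0.0, 0.0]
x_sgs = [0.0, 0.0, 0.0]
for _ in range(4):
    x_jacobi = jacobi_sweep(A, b, x_jacobi)
    x_sgs = symmetric_gauss_seidel_sweep(A, b, x_sgs)

print("Jacobi residual:", residual_norm(A, b, x_jacobi))
print("Symmetric Gauss Seidel residual:", residual_norm(A, b, x_sgs))
```

<p>The Jacobi sweep only reads values from the previous iterate, so every unknown can be updated independently, which is what makes it data parallel. The Gauss Seidel sweep reuses freshly updated values and typically needs fewer sweeps for the same residual reduction, although one symmetric sweep here costs roughly as much as two Jacobi sweeps.</p>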
<p>All of the methods make varying approximations in constructing the implicit matrix. The LU-SGS and DPLUR
methods approximate the <script type="math/tex">L</script>, <script type="math/tex">D</script>, and <script type="math/tex">U</script> matrices with their spectral radii, while the BLU-SGS and BDPLUR methods use the full
<script type="math/tex">n</script> x <script type="math/tex">n</script> matrix on the diagonal. The approximations made by LU-SGS and DPLUR allow them to use less memory as they do not need to
store a full matrix for <script type="math/tex">D</script>, but only a scalar value. They are also more computationally efficient as the inversion of the <script type="math/tex">D</script>
matrix is trivial. The BLU-SGS and BDPLUR methods make fewer approximations and therefore should converge in fewer iterations. However,
they are more computationally expensive and require more memory. Using the full matrix on the diagonal hurts the diagonal dominance
of the linear system, making these methods less stable. They may require a lower CFL number than their scalar diagonal counterparts.</p>
<h2 id="supersonic-wedge">Supersonic Wedge</h2>
<p>To compare these implicit methods the simulation of supersonic turbulent flow over a 15 degree wedge was used. Freestream conditions
of 23842.3 <script type="math/tex">Pa</script>, 0.379597 <script type="math/tex">\frac{kg}{m^3}</script>, and 739.9 <script type="math/tex">\frac{m}{s}</script> were used. The Reynolds-Averaged Navier-Stokes equations
were solved with the <a href="https://turbmodels.larc.nasa.gov/wilcox.html">Wilcox <script type="math/tex">k-\omega</script> 2006</a> turbulence model. The block matrix
methods became unstable when the default freestream eddy viscosity ratio of ten was used, so for these simulations all methods used
0.001. The grid used was 101 x 121 x 2 with near wall spacing of <script type="math/tex">1.0 \times 10^{-6} m</script> to ensure a <script type="math/tex">y^+</script> value less than one. The
simulations were solved with second order accuracy in space using MUSCL reconstruction with Aither’s <em>thirdOrder</em> option. The
<em>minmod</em> limiter was used to avoid spurious oscillations near the shock. Contours of Mach number and turbulent eddy viscosity are
shown below.</p>
<p><img src="/downloads/turbWedgeMach.png" alt="Mach" class="center-image" /></p>
<center>Mach contour of supersonic flow over wedge.</center>
<p><img src="/downloads/turbWedgeEddyVisc.png" alt="Turbulent Eddy Viscosity Ratio" class="center-image" /></p>
<center>Turbulent eddy viscosity ratio contour of supersonic flow over wedge.</center>
<h2 id="implementation-of-implicit-methods">Implementation Of Implicit Methods</h2>
<p>The implicit methods differ primarily in their construction of the <script type="math/tex">D</script>, <script type="math/tex">L</script>, and <script type="math/tex">U</script> matrices. Since the <script type="math/tex">L</script> and <script type="math/tex">U</script>
matrices are constructed in a similar way to the <script type="math/tex">D</script> matrix, only the <script type="math/tex">D</script> matrix will be shown here. The LU-SGS and DPLUR
methods use an identical <script type="math/tex">D</script> matrix. The BLU-SGS and BDPLUR methods use an identical <script type="math/tex">D</script> matrix as well. In all methods
the <script type="math/tex">D</script> matrix is the only one that is stored, while the <script type="math/tex">L</script> and <script type="math/tex">U</script> matrices are calculated on-the-fly. The mean flow
and turbulence equations are handled separately, so two scalars are used for <script type="math/tex">D</script> in the LU-SGS and DPLUR methods. For
BLU-SGS and BDPLUR a 5 x 5 matrix and a 2 x 2 matrix (for a two equation turbulence model) are used. The inviscid flow jacobian
follows the derivation in <a href="http://www.amazon.com/Computational-Fluid-Dynamics-Principles-Applications/dp/0080445063">Blazek</a>,
and the viscous flow jacobian uses the thin shear layer approximation and the derivation shown in
<a href="https://aerodynamics.lr.tudelft.nl/~rdwight/pub/rdwight-PhDThesis-ImplicitAndAdjoint.pdf">Dwight</a>.
The following definitions are used in the equations below:
<script type="math/tex">\phi = 0.5 \left( \gamma - 1 \right) \vec{v} \cdot \vec{v}</script>, <script type="math/tex">v_n = \vec{v} \cdot \vec{n}</script>,
<script type="math/tex">a_1 = \gamma E - \phi</script>, and <script type="math/tex">a_3 = \gamma - 2</script></p>
<script type="math/tex; mode=display">D_{scalar} = \lambda_i + \lambda_v + \lambda_s</script>
<script type="math/tex; mode=display">\begin{equation} \lambda_{i_{flow}} = 0.5 A \left(\left| \vec{v} \cdot \vec{n} \right| + a \right)
\qquad
\lambda_{i_{turb}} = 0.5 A \left(\left| \vec{v} \cdot \vec{n} \right| + \vec{v} \cdot \vec{n} \right) \end{equation}</script>
<script type="math/tex; mode=display">\begin{equation} \lambda_{v_{flow}} = \frac{A}{\Delta x} \left( \frac{\mu}{Pr} + \frac{\mu_t}{Pr_t} \right) \max\left( \frac{4}{3 \rho}, \frac{\gamma}{\rho}\right)
\qquad
\lambda_{v_{turb}} = \frac{A}{\Delta x} \frac {\mu + \sigma_k \mu_t}{\rho} \end{equation}</script>
<script type="math/tex; mode=display">\begin{equation} \lambda_{s_{flow}} = 0
\qquad
\lambda_{s_{turb}} = -2 \beta^* \omega V \end{equation}</script>
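<p>As a rough illustration of the scalar diagonal for the mean flow equations, the Python sketch below evaluates the inviscid and viscous spectral radii written above (the source term contribution is zero for the flow equations). The freestream pressure, density, and velocity are taken from the wedge case in this post. The ratio of specific heats, viscosity, Prandtl numbers, face area, and cell spacing are assumed values chosen only so the example runs, and the function name is hypothetical.</p>

```python
import math


def scalar_diagonal_flow(area, dx, vel, normal, sound_speed, rho,
                         mu, mu_t, prandtl, prandtl_t, gamma):
    """Scalar D for the mean flow equations: lambda_i + lambda_v (lambda_s = 0)."""
    v_n = sum(v * n for v, n in zip(vel, normal))
    # Inviscid spectral radius.
    lambda_i = 0.5 * area * (abs(v_n) + sound_speed)
    # Viscous spectral radius.
    lambda_v = (area / dx) * (mu / prandtl + mu_t / prandtl_t) * max(
        4.0 / (3.0 * rho), gamma / rho
    )
    return lambda_i + lambda_v


# Freestream values from the wedge case; everything else below is assumed.
gamma = 1.4
pressure, rho, speed = 23842.3, 0.379597, 739.9
sound_speed = math.sqrt(gamma * pressure / rho)  # ideal gas assumption

d_flow = scalar_diagonal_flow(
    area=1.0e-4, dx=1.0e-6, vel=(speed, 0.0, 0.0), normal=(1.0, 0.0, 0.0),
    sound_speed=sound_speed, rho=rho, mu=1.5e-5, mu_t=1.5e-8,
    prandtl=0.72, prandtl_t=0.9, gamma=gamma,
)
print("Mach number:", speed / sound_speed)
print("Scalar diagonal:", d_flow)
```

<p>Because only this single scalar is stored per cell, inverting the diagonal reduces to a division, which is the source of the memory and cost savings of the scalar methods.</p>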
<hr />
<script type="math/tex; mode=display">D_{block} = \frac{\partial F_i}{\partial W} - \frac{\partial F_v}{\partial W} - \frac{\partial S}{\partial W}</script>
<script type="math/tex; mode=display">% <![CDATA[
\frac{\partial F_{i_{flow}}}{\partial W} = \left[ \begin{array}{ccccc}
0 & n_x & n_y & n_z & 0 \\
\phi n_x - v_x v_n & v_n - a_3 n_x v_x & v_x n_y - \left(\gamma - 1 \right) v_y n_x & v_x n_z - \left(\gamma - 1 \right) v_z n_x & \left(\gamma - 1 \right) n_x \\
\phi n_y - v_y v_n & v_y n_x - \left(\gamma - 1 \right) v_x n_y & v_n - a_3 n_y v_y & v_y n_z - \left(\gamma - 1 \right) v_z n_y & \left(\gamma - 1 \right) n_y \\
\phi n_z - v_z v_n & v_z n_x - \left(\gamma - 1 \right) v_x n_z & v_z n_y - \left(\gamma - 1 \right) v_y n_z & v_n - a_3 n_z v_z & \left(\gamma - 1 \right) n_z \\
v_n \left(\phi - a_1 \right) & a_1 n_x - \left(\gamma - 1 \right) v_x v_n & a_1 n_y - \left(\gamma - 1 \right) v_y v_n & a_1 n_z - \left(\gamma - 1 \right) v_z v_n & \gamma v_n
\end{array} \right] %]]></script>
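<p>One way to gain confidence in a matrix like the one above is to compare it against a finite-difference approximation of the flux. The Python sketch below does this under several assumptions: an ideal gas with a ratio of specific heats of 1.4, conserved variables ordered as density, momentum, and total energy per unit volume, a unit normal, and no face area factor. The function names and the test state are made up for the example and do not come from the Aither source.</p>

```python
GAMMA = 1.4  # ratio of specific heats; assumed value for the example


def inviscid_flux(w, n):
    """Inviscid flux normal to the face for conserved variables w."""
    rho, mx, my, mz, rho_e = w
    vx, vy, vz = mx / rho, my / rho, mz / rho
    v_n = vx * n[0] + vy * n[1] + vz * n[2]
    p = (GAMMA - 1.0) * (rho_e - 0.5 * rho * (vx * vx + vy * vy + vz * vz))
    return [
        rho * v_n,
        mx * v_n + p * n[0],
        my * v_n + p * n[1],
        mz * v_n + p * n[2],
        (rho_e + p) * v_n,
    ]


def inviscid_jacobian(w, n):
    """Flux Jacobian dF/dW written out entry by entry from the matrix above."""
    rho, mx, my, mz, rho_e = w
    v = [mx / rho, my / rho, mz / rho]
    g1 = GAMMA - 1.0
    phi = 0.5 * g1 * sum(c * c for c in v)
    v_n = sum(c * m for c, m in zip(v, n))
    a1 = GAMMA * rho_e / rho - phi  # E is the total energy per unit mass
    a3 = GAMMA - 2.0
    jac = [[0.0, n[0], n[1], n[2], 0.0]]
    for i in range(3):
        row = [phi * n[i] - v[i] * v_n]
        for j in range(3):
            if i == j:
                row.append(v_n - a3 * n[i] * v[i])
            else:
                row.append(v[i] * n[j] - g1 * v[j] * n[i])
        row.append(g1 * n[i])
        jac.append(row)
    jac.append(
        [v_n * (phi - a1)]
        + [a1 * n[j] - g1 * v[j] * v_n for j in range(3)]
        + [GAMMA * v_n]
    )
    return jac


def finite_difference_jacobian(w, n, eps=1.0e-6):
    """Central-difference approximation of dF/dW, used only as a check."""
    jac = [[0.0] * 5 for _ in range(5)]
    for j in range(5):
        step = eps * max(abs(w[j]), 1.0)
        w_plus, w_minus = list(w), list(w)
        w_plus[j] += step
        w_minus[j] -= step
        f_plus, f_minus = inviscid_flux(w_plus, n), inviscid_flux(w_minus, n)
        for i in range(5):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * step)
    return jac


# Nondimensional test state: rho = 1.2, v = (0.8, 0.3, -0.2), p = 1.0.
rho, vel, p = 1.2, (0.8, 0.3, -0.2), 1.0
rho_e = p / (GAMMA - 1.0) + 0.5 * rho * sum(c * c for c in vel)
state = [rho, rho * vel[0], rho * vel[1], rho * vel[2], rho_e]
normal = (0.6, 0.0, 0.8)

analytic = inviscid_jacobian(state, normal)
numeric = finite_difference_jacobian(state, normal)
max_diff = max(
    abs(analytic[i][j] - numeric[i][j]) for i in range(5) for j in range(5)
)
print("largest difference:", max_diff)
```

<p>If an entry of the analytic matrix were mistyped, the largest difference would be of order one rather than near the finite-difference truncation error, which makes this a cheap sanity check when implementing block methods.</p>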
<script type="math/tex; mode=display">% <![CDATA[
\frac{\partial F_{i_{turb}}}{\partial W} = \left[ \begin{array}{cc}
0.5 A \left(\left| \vec{v} \cdot \vec{n} \right| + \vec{v} \cdot \vec{n} \right) & 0 \\
0 & 0.5 A \left(\left| \vec{v} \cdot \vec{n} \right| + \vec{v} \cdot \vec{n} \right)
\end{array} \right] %]]></script>
<script type="math/tex; mode=display">% <![CDATA[
\frac{\partial F_{v_{flow}}}{\partial W} = \mp \frac{A \left(\mu + \mu_t \right)}{\Delta x} \left[ \begin{array}{ccccc}
0 & 0 & 0 & 0 & 0 \\
0 & \frac{1}{3} n^2_x + 1 & \frac{1}{3} n_y n_x & \frac{1}{3} n_z n_x & 0 \\
0 & \frac{1}{3} n_x n_y & \frac{1}{3} n^2_y + 1 & \frac{1}{3} n_z n_y & 0 \\
0 & \frac{1}{3} n_x n_z & \frac{1}{3} n_y n_z & \frac{1}{3} n^2_z + 1 & 0 \\
\psi^{\pm}_{\rho} & \mp \frac{\Delta x}{2 \left(\mu + \mu_t\right)} n_l \tau_{lx}+ \pi_x & \mp \frac{\Delta x}{2 \left(\mu + \mu_t\right)} n_l \tau_{ly}+ \pi_y & \mp \frac{\Delta x}{2 \left(\mu + \mu_t\right)} n_l \tau_{lz}+ \pi_z & \psi^{\pm}_{p}
\end{array} \right] \cdot
\left[ \begin{array}{ccccc}
1 & 0 & 0 & 0 & 0 \\
-\frac{v_x}{\rho} & \frac{1}{\rho} & 0 & 0 & 0 \\
-\frac{v_y}{\rho} & 0 & \frac{1}{\rho} & 0 & 0 \\
-\frac{v_z}{\rho} & 0 & 0 & \frac{1}{\rho} & 0 \\
\phi & -\left( \gamma - 1\right) v_x & -\left( \gamma - 1\right) v_y & -\left( \gamma - 1\right) v_z & \gamma - 1
\end{array} \right] %]]></script>
<script type="math/tex; mode=display">\begin{equation} \psi^+_{\rho} = -\frac{\left(\kappa + \kappa_t \right) T_l}{\left(\mu + \mu_t \right) \rho_l}
\qquad
\psi^-_{\rho} = -\frac{\left(\kappa + \kappa_t \right) T_r}{\left(\mu + \mu_t \right) \rho_r}
\qquad
\psi^+_{p} = -\frac{\left(\kappa + \kappa_t \right)}{\left(\mu + \mu_t \right) \rho_l}
\qquad
\psi^-_{p} = -\frac{\left(\kappa + \kappa_t \right)}{\left(\mu + \mu_t \right) \rho_r}
\end{equation}</script>
<script type="math/tex; mode=display">\begin{equation} \pi_x = \left(\frac{1}{3} n^2_x + 1\right) v_x + \frac{1}{3} n_y n_x v_y + \frac{1}{3} n_z n_x v_z
\qquad
\pi_y = \frac{1}{3} n_x n_y v_x + \left(\frac{1}{3} n^2_y + 1\right) v_y + \frac{1}{3} n_z n_y v_z
\qquad
\pi_z = \frac{1}{3} n_x n_z v_x + \frac{1}{3} n_y n_z v_y + \left(\frac{1}{3} n^2_z + 1\right) v_z
\end{equation}</script>
<script type="math/tex; mode=display">% <![CDATA[
\frac{\partial F_{v_{turb}}}{\partial W} = \left[ \begin{array}{cc}
\frac{A}{\Delta x} \frac{\mu + \sigma_k \mu_t}{\rho} & 0 \\
0 & \frac{A}{\Delta x} \frac{\mu + \sigma_{\omega} \mu_t}{\rho}
\end{array} \right] %]]></script>
<script type="math/tex; mode=display">% <![CDATA[
\begin{equation} \frac{\partial S_{flow}}{\partial W} = 0
\qquad
\frac{\partial S_{turb}}{\partial W} = \left[ \begin{array}{cc}
-2 \beta^* \omega V & 0 \\
0 & -2 \beta \omega V
\end{array} \right] \end{equation} %]]></script>
<p>Once the <script type="math/tex">D</script> matrix is calculated and stored, the off-diagonal <script type="math/tex">L</script> and <script type="math/tex">U</script> matrices are computed on-the-fly during the
matrix relaxation procedure. For the scalar methods the off diagonal is further approximated as shown below. For the block
methods, the full matrix-vector multiplication is done on the off diagonals.</p>
<script type="math/tex; mode=display">0.5 \left( \frac{\partial F}{\partial W} \Delta W A + \lambda \right) \approx 0.5 \left( \Delta F + \lambda\right)</script>
<h2 id="results">Results</h2>
<p>The four implicit methods were used to solve the supersonic wedge case while varying the number of sweeps (Gauss-Seidel or Jacobi). The
simulations were run for 10,000 iterations in parallel on 4 processors. Aither was compiled with GCC 6.1 and OpenMPI 2.0.0 on
Ubuntu 16.04. The processor used was a quad-core Intel Core i7-4700MQ @ 2.4 GHz. Each simulation was run only once, so this
was not a rigorous timing study. The table below summarizes all the cases run, including the final L2 norm of the mass residual
relative to the highest mass residual within the first 5 iterations. The matrix relaxation factor used is also shown. All cases
except the BLU-SGS and BDPLUR cases with the lowest number of sweeps used the default value of 1. Since the block matrix
methods have worse diagonal dominance, relaxation is occasionally needed to aid stability. All simulations were run using
local time stepping with a CFL number of 1e5.</p>
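<p>The residual normalization described above can be sketched as follows. This is a minimal sketch rather than Aither's code: the function names are hypothetical, the per-cell mass residuals are assumed to be available as plain lists, and the norm is assumed not to be scaled by the number of cells.</p>

```python
import math


def l2_norm(values):
    """L2 norm of a list of per-cell residuals."""
    return math.sqrt(sum(v * v for v in values))


def relative_residual_history(residuals_per_iteration, n_reference=5):
    """Normalize each iteration's L2 norm by the largest norm found
    within the first n_reference iterations."""
    norms = [l2_norm(r) for r in residuals_per_iteration]
    reference = max(norms[:n_reference])
    return [n / reference for n in norms]


if __name__ == "__main__":
    # Made-up residuals for a two-cell grid over three iterations.
    history = relative_residual_history([[3.0, 4.0], [6.0, 8.0], [0.3, 0.4]])
    print(history)
```

<p>Normalizing by the peak over the first few iterations, rather than by the very first iteration, avoids a misleading reference value when the residual rises briefly at startup.</p>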
<table>
<thead>
<tr>
<th>Method</th>
<th>Sweeps</th>
<th>Relaxation Factor</th>
<th>Final Mass Residual (Relative L2 Norm)</th>
<th>Simulation Time (s)</th>
</tr>
</thead>
<tbody>
<tr>
<td>LU-SGS</td>
<td>1</td>
<td>1</td>
<td>3.9302e-5</td>
<td>477.4</td>
</tr>
<tr>
<td>LU-SGS</td>
<td>2</td>
<td>1</td>
<td>2.1304e-6</td>
<td>612.8</td>
</tr>
<tr>
<td>LU-SGS</td>
<td>4</td>
<td>1</td>
<td>2.6281e-8</td>
<td>893.7</td>
</tr>
<tr>
<td>DPLUR</td>
<td>2</td>
<td>1</td>
<td>6.5039e-5</td>
<td>503.3</td>
</tr>
<tr>
<td>DPLUR</td>
<td>4</td>
<td>1</td>
<td>2.0773e-5</td>
<td>603.9</td>
</tr>
<tr>
<td>DPLUR</td>
<td>8</td>
<td>1</td>
<td>1.9359e-6</td>
<td>870</td>
</tr>
<tr>
<td>BLU-SGS</td>
<td>2</td>
<td>1.1</td>
<td>1.4966e-5</td>
<td>2013</td>
</tr>
<tr>
<td>BLU-SGS</td>
<td>4</td>
<td>1</td>
<td>3.3500e-6</td>
<td>3223</td>
</tr>
<tr>
<td>BDPLUR</td>
<td>4</td>
<td>1.2</td>
<td>4.2507e-5</td>
<td>2057</td>
</tr>
<tr>
<td>BDPLUR</td>
<td>8</td>
<td>1</td>
<td>1.6883e-5</td>
<td>3319</td>
</tr>
</tbody>
</table>
<p><img src="/downloads/MassConvergence.png" alt="Residual Convergence" class="center-image" /></p>
<center>Convergence of the mass residual.</center>
<p>The same behavior is seen in the residuals for the other equations, so they are omitted here. The number of sweeps for
the DPLUR-based methods was doubled compared to their LU-SGS-based counterparts because the DPLUR methods use a Jacobi
relaxation instead of the 2x more efficient Gauss-Seidel relaxation. It is expected that the block matrix methods take
longer due to the increased computational effort required. However, these methods will likely improve in later versions of
Aither, as a linear algebra library such as Eigen or PETSc is slated to be used. Based on the literature, the block matrix
methods are expected to perform better on highly stretched grids, such as this one used for a RANS simulation. However,
for this case the benefit of the block matrix methods is not observed. Previous simulations have shown a small benefit to
using the block matrix methods in some cases, but usually not enough to justify their extra cost.</p>
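<p>The factor of two between Jacobi and Gauss-Seidel can be checked on a small model problem. The sketch below counts the sweeps each method needs on a tridiagonal system; the matrix, the tolerance, and the function names are assumptions chosen for illustration, and it uses a plain forward Gauss-Seidel sweep rather than the symmetric sweeps of LU-SGS.</p>

```python
def residual_norm(x, b):
    """L2 norm of b - A x for the tridiagonal matrix A = tridiag(-1, 2, -1)."""
    n = len(x)
    total = 0.0
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        r = b[i] - (2.0 * x[i] - left - right)
        total += r * r
    return total ** 0.5


def sweeps_to_converge(b, use_gauss_seidel, tol=1.0e-8, max_sweeps=100000):
    """Count sweeps until the residual drops by the factor tol."""
    n = len(b)
    x = [0.0] * n
    r0 = residual_norm(x, b)
    for sweep in range(1, max_sweeps + 1):
        # Jacobi reads only values from the previous sweep (a frozen copy);
        # Gauss-Seidel reads x itself, so it sees values updated this sweep.
        source = x if use_gauss_seidel else list(x)
        for i in range(n):
            left = source[i - 1] if i > 0 else 0.0
            right = source[i + 1] if i < n - 1 else 0.0
            x[i] = 0.5 * (b[i] + left + right)
        if residual_norm(x, b) < tol * r0:
            return sweep
    return max_sweeps


if __name__ == "__main__":
    b = [1.0] * 20
    print("Jacobi sweeps      :", sweeps_to_converge(b, use_gauss_seidel=False))
    print("Gauss-Seidel sweeps:", sweeps_to_converge(b, use_gauss_seidel=True))
```

<p>For this model matrix the Gauss-Seidel spectral radius is the square of the Jacobi one, so about half as many sweeps are needed. The matrices in a flow solver differ from this one, so the factor there is only approximate.</p>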
<h2 id="conclusions">Conclusions</h2>
<p>For the supersonic wedge case analyzed here, LU-SGS has the best performance. It is stable and efficient. The block matrix
methods show rather poor performance in terms of residual drop and simulation time; however, the latter is expected to improve in
future versions of Aither. For cases other than this one, the block matrix methods have shown better convergence than their scalar
counterparts, with no need to increase the relaxation factor from the default value of 1. How do these results compare with your
experience? Comment below!</p>
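<p>One way to condense the results table into a single figure of merit is the wall time spent per order of magnitude of residual reduction. This metric is an assumption of the sketch below, not something reported by Aither; the numbers are copied from the table above.</p>

```python
import math

# (method, sweeps, final relative mass residual, simulation time in seconds),
# copied from the results table.
CASES = [
    ("LU-SGS", 1, 3.9302e-5, 477.4),
    ("LU-SGS", 2, 2.1304e-6, 612.8),
    ("LU-SGS", 4, 2.6281e-8, 893.7),
    ("DPLUR", 2, 6.5039e-5, 503.3),
    ("DPLUR", 4, 2.0773e-5, 603.9),
    ("DPLUR", 8, 1.9359e-6, 870.0),
    ("BLU-SGS", 2, 1.4966e-5, 2013.0),
    ("BLU-SGS", 4, 3.3500e-6, 3223.0),
    ("BDPLUR", 4, 4.2507e-5, 2057.0),
    ("BDPLUR", 8, 1.6883e-5, 3319.0),
]


def seconds_per_decade(final_residual, time_s):
    """Wall time divided by the orders of magnitude of residual reduction."""
    return time_s / -math.log10(final_residual)


def rank_cases(cases):
    """Sort cases from cheapest to most expensive per decade of reduction."""
    return sorted(cases, key=lambda c: seconds_per_decade(c[2], c[3]))


if __name__ == "__main__":
    for method, sweeps, residual, time_s in rank_cases(CASES):
        cost = seconds_per_decade(residual, time_s)
        print(f"{method:8s} sweeps={sweeps}  {cost:7.1f} s/decade")
```

<p>On this metric the scalar methods rank ahead of the block methods, in line with the conclusion above. Since each case was run only once, small differences between neighboring entries should not be over-read.</p>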
<h2 id="references">References</h2>
<p><a href="http://aero-comlab.stanford.edu/Papers/AIAA-10007-471.pdf">[1]</a> Yoon, S. and Jameson, A. Lower-Upper Symmetric-Gauss-Seidel Method for
the Euler and Navier-Stokes Equations. 1988. AIAA Journal Vol 26 No 9.</p>
<p><a href="http://www.dept.ku.edu/~cfdku/papers/2000-AIAAJ.pdf">[2]</a> Chen, R. F. and Wang, Z. J. Fast, Block Lower-Upper Symmetric Gauss-Seidel
Scheme for Arbitrary Grids. December 2000. AIAA Journal Vol 38 No 12.</p>
<p><a href="https://www.researchgate.net/publication/265377421_A_data-parallel_LU-SGS_method_for_reacting_flows">[3]</a> Candler, G. V. et al.
A Data-Parallel LU-SGS Method for Reacting Flows. 1994. AIAA 94-0410.</p>
<p><a href="https://www.researchgate.net/publication/245424386_A_data-parallel_LU_relaxation_method_for_the_Navier-Stokes_equations">[4]</a> Wright, M. J. et al.
Data-Parallel Lower-Upper Relaxation Method for the Navier-Stokes Equations. 1996. AIAA Journal Vol 34 No 7.</p>
Sun, 25 Sep 2016 22:00:00 +0000
http://mnucci32.github.io/aither/2016/09/25/implicit-solver-comparison.html
CFD, Aither, C++, v0.3.0, LU-SGS, DPLUR, BLU-SGS, BDPLUR, Implicit, lower upper relaxation