Introduction To Atmospheric Chemistry
GEOS-Chem v11-02-final will also carry the designation GEOS-Chem 12.0.0. We are migrating to
a purely numeric versioning system in order to adhere more closely to software development best
practices. For a complete description of the new versioning system, please see our GEOS-Chem
version numbering system wiki page.
1. Introduction
1.1 Overview
GEOS-Chem is a global 3-D chemical transport model (CTM) for atmospheric composition driven by meteorological
input from the Goddard Earth Observing System (GEOS) of the NASA Global Modeling and Assimilation Office (GMAO). It
is applied by research groups around the world to a wide range of atmospheric composition problems. Scientific
direction of the model is provided by the GEOS-Chem Steering Committee and by User Working Groups. The model is
managed by the GEOS-Chem Support Team, based at Harvard University and Dalhousie University.
This User's Guide provides information and instructions for installing and running GEOS-Chem. Much of the GEOS-
Chem documentation has been moved to the GEOS-Chem online wiki located at http://wiki.geos-chem.org/. Wiki
hyperlinks are provided throughout this guide directing you to where you can find in-depth information on different
GEOS-Chem topics. We encourage you to explore the GEOS-Chem wiki to supplement the information provided in this
User's Guide and to access the most up-to-date information available for GEOS-Chem. If you are new to GEOS-Chem,
please read the GEOS-Chem welcome letter for new users as well as the GEOS-Chem Overview for a brief description
of GEOS-Chem and how it is managed.
Chapter: Description
2. Requirements for Installing GEOS-Chem: Lists required software packages that you need to run GEOS-Chem.
3. The NetCDF Library: Describes how to load the netCDF library on your system.
4. The GEOS-Chem Shared Data Directories: Describes how to download the shared data directories for the emissions and meteorology data that drive GEOS-Chem simulations.
5. The GEOS-Chem Source Code Directory: Describes how to download the GEOS-Chem source code.
6. The GEOS-Chem Run Directories: Describes how to generate run directories for the various GEOS-Chem simulations and the input and output files of GEOS-Chem.
7. Compiling GEOS-Chem: Describes how to compile the GEOS-Chem source code into an executable file.
8. Coding and Debugging: Lists resources for writing efficient code and for debugging GEOS-Chem.
10. GEOS-Chem Species: Describes the GEOS-Chem species that are used in each type of simulation and how you can add new species to an existing simulation.
11. Meteorological Fields: Describes the meteorological data fields used by GEOS-Chem.
12. Horizontal and Vertical Grids: Describes the horizontal and vertical grids used by GEOS-Chem.
14. GEOS-Chem Reference: Contains links to the reference guides that are automatically generated from the GEOS-Chem source code and the GEOS-Chem coding style guide.
15. GEOS-Chem Version History: Lists new features in each GEOS-Chem version.
16. High Performance GEOS-Chem: Describes the development of an MPI implementation for GEOS-Chem.
Development of the GEOS-Chem model and its adjoint is a grass-roots activity by individual scientists pursuing their
own research interests and sharing their work for the benefit of all. Being a GEOS-Chem user therefore comes with
expectations and responsibilities. Please read and follow the guidelines below to help us maintain an active and vibrant
GEOS-Chem community.
1. Send an email to geos-chem-support [at] as.harvard.edu containing a paragraph describing how you and
the other members of your research group plan to use GEOS-Chem. We will add this to the GEOS-Chem
People and Projects web page. Registering your group helps us to accurately track how many research
groups are using GEOS-Chem.
2. Subscribe to the general GEOS-Chem email list and to an email list of one or more Working Groups. The
general GEOS-Chem email list is where we make announcements about new model releases, bugs and fixes,
and other information of interest to the entire GEOS-Chem community. In addition, each GEOS-Chem
Working Group has its own email list that group members use to discuss various aspects of model
development and validation. Please see the Subscribing to the GEOS-Chem email lists wiki page for
instructions.
3. We encourage you to join a Working Group that is most relevant to your area of research. The Working
Groups foster communication and collaboration between GEOS-Chem users and they identify priorities for
model development to the GEOS-Chem Steering Committee. Please take a moment to introduce yourself to
any relevant Working Group via email.
4. Adhere to the list of best practices. In particular, if you discover a problem (e.g. bugs, missing files, numerical
issues, etc.), please alert the GEOS-Chem Support Team right away. Other GEOS-Chem users will most
certainly benefit from your discovery!
5. We encourage you to send us your timing results from a 7-day time test. This will allow us to keep a list of how
the model is performing across several different platform/compiler combinations. We will post this information
on the GEOS-Chem performance wiki page.
6. Please read the GEOS-Chem Credits and References page containing guidelines for giving appropriate credit
to developers through co-authorship or citation. You may also refer to the Narrative Description of the GEOS-
Chem Model wiki page for assistance with correctly citing relevant model components of the current standard
version of GEOS-Chem in your publications.
See the following pages for a list of new features in GEOS-Chem v12.0.0 (aka v11-02-final) and for details on the 1-
month and 1-year benchmark simulations used to validate this version.
List of new intermediate benchmarked versions and the features they contain
GEOS-Chem v11-02 benchmark history
GEOS-Chem 12.0.0 (aka v11-02-final) contains fixes for several bugs and technical issues. If you have been using
an older version of GEOS-Chem (e.g. v11-01, v10-01), please check this list to see if any outstanding issues have since
been resolved.
A few issues still remain unresolved (or cannot be resolved due to compiler bugs, etc.). For more information, please
see:
GEOS-Chem Online User's Guide
In order to use GEOS-Chem, it is helpful to have general experience with the following:
For a useful list of resources, including tutorials, please see the GEOS-Chem basics and Version control with Git wiki
pages.
Architecture: Description

Any computer system running a version of the Unix operating system:
These include Linux distributions such as Red Hat, CentOS, SuSE, Fedora, Ubuntu, etc.
Please see the following wiki pages for a list of required system
software (compilers, libraries, and utilities) that you will need to install
before you can run GEOS-Chem on your system.
GEOS-Chem basics
Minimum system requirements for GEOS-Chem
o Hardware requirements
o Software requirements
o Disk space requirements
If you are not sure what hardware or software is available to you, then
please check with your IT department. For the most up-to-date
information regarding GEOS-Chem performance on specific platforms
and operating systems, please see our GEOS-Chem performance wiki
page.
NOTE: While it may be possible to run GEOS-Chem on Windows (via a
Unix emulator such as Cygwin), we cannot officially support this
platform.

Amazon Web Services Elastic Compute (EC2) cloud:
If your institution does not have its own computer cluster, or if you are
looking for some extra computational capacity not available at your
institution, you might want to consider running GEOS-Chem on the
Amazon EC2 cloud. This brings the following advantages:
The required software packages mentioned in the prior section—the compilers, the operating system, the Git version
control system, etc.—will usually come pre-installed on your system (or machine image if you are using the Amazon
cloud). The most important items that you will still need to download to your system are the following:
Item: Description

A NetCDF Library:
Most of the data files read by GEOS-Chem v11-02 are now
in COARDS-compliant netCDF format. As part of the High-Performance
Computing (HPC) GEOS-Chem project, we are in the process of
converting the remaining binary data files to netCDF format. This will
facilitate running GEOS-Chem in HPC environments, a requirement for
several objectives such as interfacing GEOS-Chem with the NASA
GEOS-5 GCM.

GEOS-Chem Shared Data Directories (including the sample restart files):
The GEOS-Chem shared data directories contain many large files (e.g.
emissions, meteorological fields) that cannot fit into your own personal
disk space. We recommend that you (or your IT staff) download the
shared data directories to a common disk space where all GEOS-Chem
users in your group or institution can access them.
o If you are using the Amazon EC2 cloud: A copy of the
shared data directories has been synced to the Amazon S3
storage system. For more information about how to access the
data, please see Jiawei Zhuang's cloud computing
tutorial at cloud-gc.readthedocs.io.
o If you are using your institution's computer system: You
will only need to download the complete shared data directories
once. As new met fields or emissions data
are added, you can download the new folders and files
individually. (NOTE: If several people at your institution are
already using GEOS-Chem, then chances are the shared data
directories have already been downloaded to a location on
your disk storage server. Check with your IT staff.)

GEOS-Chem Source Code Directory:
The GEOS-Chem source code is available for download to your
personal disk space using the Git source code management system.

GEOS-Chem Run Directories:
You must create a run directory for each combination of meteorology
field, grid resolution, and simulation type that you use. Run directories
are created with our GEOS-Chem Unit Tester and may be stored locally
in your personal disk space.
Overview
GEOS-Chem requirements
Before you can run GEOS-Chem, you will need to have the following items. Some of these will
be already pre-installed on your computer system.
Item: Description

EITHER a Unix-based computer system OR an account on the Amazon Web Services cloud:
You will need a Unix operating system environment in order to run GEOS-Chem. Any flavor
of Unix (e.g. CentOS, Ubuntu, Fedora, etc.) should work just fine.
If your institution has computational resources (e.g. a shared computer cluster with many
cores, sufficient disk storage and memory), then you can run GEOS-Chem there. Contact
your IT staff for assistance.
If your institution lacks computational resources (or if you need additional computational
resources beyond what is available), then you should consider signing up for access to the
Amazon Web Services cloud. Using the cloud has the following advantages:
o You can run GEOS-Chem without having to invest in local hardware and maintenance
personnel.
o You won't have to download any meteorological fields or emissions data. All of the
necessary data input for GEOS-Chem will be available on the cloud.
o Your GEOS-Chem runs will be 100% reproducible, because you will initialize your
computational environment the same way every time.
Note that you will be charged for the computational time that you use, and for any data that
you download off the cloud.
GEOS-Chem v11-02 will be the first version that is compatible with cloud computing. You can
learn more about how to use GEOS-Chem on the cloud by visiting this tutorial.
We will post more information about how to get set up on the Amazon Web Services cloud
shortly.

GNU Make:
GNU Make directs the compilation sequence. It tells the compiler the order in which files
should be compiled and which compilation options to use.
You probably won't have to install GNU Make, since it comes with most Unix distributions
by default. GNU Make will also be available for you on the Amazon cloud.

Git (a source code management system):
The Git source-code management software is a free and open-source package that we use to
enforce strict version control. You will also need Git to download the GEOS-Chem source code
and the GEOS-Chem Unit Tester.
Git is usually installed by default with most Unix distributions. It will also be available for
you on the Amazon cloud.

A Fortran compiler:
GEOS-Chem is written in the Fortran language. A Fortran compiler is used to create an
executable file from the GEOS-Chem source code. You can use either the GNU Fortran
Compiler (aka gfortran) or the Intel Fortran Compiler (aka ifort) to compile GEOS-Chem.
If you have an account on a shared computer cluster at your institution, chances are that you
will at least have a GNU Fortran Compiler version installed, and maybe a version of Intel
Fortran installed as well. Ask your IT staff for more information.
You will need the GNU Fortran Compiler to compile GEOS-Chem on the Amazon cloud.
This will already be available for you, so you won't have to install it manually.

A netCDF library installation:
GEOS-Chem uses the netCDF file format for I/O. Many GEOS-Chem restart and diagnostic
output files are written to netCDF format.
If you are using GEOS-Chem on a local computer cluster, then you (or your IT staff) will
need to install a version of netCDF. Chances are there might be one or more netCDF versions
pre-installed for you. Ask your IT staff for assistance.
If you are using GEOS-Chem on the Amazon cloud, then a netCDF library will already be
available for you to use.

A GEOS-Chem source code directory:
This directory contains the GEOS-Chem source code, which the compiler will assemble into
an executable file.
The GEOS-Chem source code can be downloaded via Git from our repository on
Bitbucket.org.

The GEOS-Chem Unit Tester:
The GEOS-Chem Unit Tester is used to error-check all of the GEOS-Chem simulations. It is
also needed to construct GEOS-Chem run directories, in which the compiled executable file
will run.

The GEOS-Chem shared data directories:
Directory structure containing the meteorology and emissions data that GEOS-Chem reads as
input.
If you are using GEOS-Chem on a local computational cluster, then you will need to
download these data manually. We recommend that you:
o Download the emissions data for HEMCO from the Harvard data archive.
The shared data directories for GEOS-Chem v11-02 and higher versions will be available on
the Amazon cloud, so you won't have to download them. You may need to tell your login
environment where to find these data. We will post more information on this shortly.

Restart files for GEOS-Chem:
These are the files containing the initial conditions for a GEOS-Chem simulation. They can
be downloaded from our data archive via FTP.
GEOS-Chem v11-01 and higher versions read and write restart files only in netCDF
format.

A visualization package:
This is software that is used to read and plot output from GEOS-Chem simulations.
Traditionally, the IDL-based GAMAP has been used for plotting GEOS-Chem-generated
data. Starting with GEOS-Chem v11-02, you will have the option to save diagnostic output
directly to netCDF data. This will give you the option to use open-source plotting packages based
on the Python language.
Please also see our GEOS-Chem User's Guide for complete information about how to set up a
GEOS-Chem simulation.
--Bob Yantosca (talk) 17:20, 16 March 2018 (UTC)
GEOS-Chem documentation and support
We have compiled a list of online GEOS-Chem documentation.
Item: Description
GEOS-Chem licensing: Information about the public license under which GEOS-Chem (and related
software) is distributed.
GEOS-Chem tutorial presentations: Several online tutorial presentations about how to use GEOS-Chem.
Unix resources
GEOS-Chem is designed to run on computers with the Unix operating system. There is no
single version of Unix; rather, Unix comes packaged in several different distributions. Many
modern computer clusters use CentOS, which is an open-source Unix implementation. Other
systems may use a proprietary Unix distribution, such as Red Hat Enterprise. GEOS-Chem will
perform in the same way regardless of the specific Unix implementation on your system.
If you require assistance setting up or customizing your Unix login environment, please contact
your local IT staff. The GEOS-Chem Support Team can only provide support for GEOS-Chem-
related issues.
Very soon you will also have the opportunity to run GEOS-Chem on the Amazon Web Services
cloud infrastructure. We will post more information about that shortly.
IMPORTANT! Please make sure that your computer system meets the minimum system
requirements for memory and disk space in order to run GEOS-Chem.
Common Unix commands
The resources below cover many common Unix commands. You will find these useful,
particularly if you have never worked on a Unix machine before.
Perl.com
Perl in 20 pages
Beginning Perl by Simon Cozens (free online book)
Robert's Perl Tutorial
Comprehensive Perl Archive Network (CPAN.org)
--Bob Yantosca (talk) 21:14, 2 November 2016 (UTC)
Fortran resources
GEOS-Chem is written in the Fortran computer language, and relies upon several of the new features
that were introduced with the Fortran-90 standard. We list below several useful resources for
your reference. Please also see our list of supported compiler versions.
Online tutorials
If you are new to Fortran (or are familiar with the older Fortran-77 standard but not Fortran-90),
then we invite you to take one or more of these tutorials:
FortranTutorial.com
Introduction to Modern Fortran (Cambridge Univ, UK)
Victor Decyk (UCLA) Fortran tutorial
Fortran 90 for the Fortran 77 programmer
Using OpenMP parallelization with Fortran
--Bob Yantosca (talk) 19:12, 2 November 2016 (UTC)
The GNU Fortran compiler
GEOS-Chem v11-01 and newer versions are compatible with the GNU Fortran compiler,
aka gfortran. This is a free and open-source compiler that comes pre-installed on many
modern computer systems. This will be our recommended compiler, starting with GEOS-Chem
v11-02.
Your system might also permit the use of Docker containers, which will allow you to load a pre-
built Unix environment with all necessary libraries, including netCDF.
--Bob Yantosca (talk) 22:18, 5 March 2018 (UTC)
netCDF references
We have also collated the following references about netCDF, which you might find useful.
Our Preparing data files for use with HEMCO page on the GEOS-Chem wiki
netCDF Home Page
netCDF Documentation
NCL (NCAR Command Language) (useful for netCDF and other file I/O)
--Bob Yantosca (talk) 21:24, 2 November 2016 (UTC)
Restart files
You will need a restart file before you can start your GEOS-Chem simulation. A restart file
contains the initial conditions for a GEOS-Chem simulation. There are two restart files for GEOS-
Chem:
ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/SPC_RESTARTS/
CAVEAT: The initial restart files do not reflect the actual atmospheric state and should
only be used to "spin up" the model. In other words, they should be used as initial values
in an initialization simulation to generate more accurate initial conditions for your
production runs.
Doing a one year spin up is usually sufficient; however, we recommend ten years for ozone,
carbon dioxide, and methane simulations, and four years for radon-lead-beryllium simulations. If
you are in doubt about how long your spin up should be for your simulation, we recommend
contacting the GEOS-Chem Working Group that specializes in your area of research.
You may spin up the model starting at any year for which there is met data, but you should
always start your simulations at the month and day corresponding to the restart file to more
accurately capture seasonal variation. If you want to start your production run at a specific date,
we recommend doing a spin up for the appropriate number of years plus the number of days
needed to reach your ultimate start date. For example, if you want to do a production simulation
starting on 12/1/13, you could spin up the model for one year using the initial GEOS-FP restart
file dated 7/1/13 and then use the new restart file to spin up the model for five additional months,
from 7/1/13 to 12/1/13.
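The date arithmetic in this example can be sketched in a few lines of Python. This is only an illustration under one reading of the text above: the spin-up begins on the restart file's month and day, runs for a whole number of years, and then continues for the remaining days up to the production start date. The function name spinup_schedule and its arguments are hypothetical and are not part of GEOS-Chem.

```python
from datetime import date

def spinup_schedule(restart_month, restart_day, production_start, spinup_years):
    """Return (spinup_start, whole_years_end, extra_days) for a spin-up run.

    Assumption: the spin-up starts on the restart file's month/day, runs for
    `spinup_years` whole years, then runs `extra_days` more days to reach the
    production start date. A Feb 29 restart date is not handled here.
    """
    # Most recent occurrence of the restart month/day on or before the
    # production start date.
    whole_years_end = date(production_start.year, restart_month, restart_day)
    if whole_years_end > production_start:
        whole_years_end = date(production_start.year - 1, restart_month, restart_day)
    # Step back the requested number of whole years.
    spinup_start = date(whole_years_end.year - spinup_years, restart_month, restart_day)
    extra_days = (production_start - whole_years_end).days
    return spinup_start, whole_years_end, extra_days

# Restart file dated 7/1, production run starting 12/1/13, one year of spin-up.
spinup_start, whole_years_end, extra_days = spinup_schedule(7, 1, date(2013, 12, 1), 1)
print(spinup_start, whole_years_end, extra_days)
```

With these inputs the remaining period runs from 7/1/13 to 12/1/13, which matches the five additional months described above.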
To determine the date of a netCDF restart file, you may use ncdump.
The -t option will return the time value as human-readable date-time strings rather than
numerical values in units such as "hours since 1985-1-1 00:00:0.0". The date of a binary punch
restart file can be determined by opening the file in GAMAP.
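To show what such a numerical time value means, the following minimal Python sketch converts a value expressed in "hours since 1985-1-1 00:00:0.0" into a calendar date. It uses only the Python standard library and does not read a netCDF file; the function name hours_to_datetime and the sample value are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Reference time taken from the units string "hours since 1985-1-1 00:00:0.0".
REFERENCE = datetime(1985, 1, 1)

def hours_to_datetime(hours_since_reference, reference=REFERENCE):
    """Convert an elapsed-hours time value into a calendar date and time."""
    return reference + timedelta(hours=hours_since_reference)

# Hypothetical time value, chosen here only for illustration.
restart_time = hours_to_datetime(249792.0)
print(restart_time)  # 2013-07-01 00:00:00
```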
Using a HEMCO restart file for your initial spin up run is optional. The HEMCO restart file
contains fields for initializing variables required for Soil NOx emissions, MEGAN biogenic
emissions, and the UCX chemistry mechanism. The HEMCO restart file that comes with a run
directory may only be used for the date and time indicated in the filename. HEMCO will
automatically recognize when a restart file is not available for the date and time required, and in
that case HEMCO will use default values to initialize those fields. You can also force HEMCO to
use the default initialization values by setting "HEMCO_RESTART" to false
in HEMCO_Config.rc. For more information, see the HEMCO User's Guide.
You can read more about restart files at the GEOS-Chem output files wiki page.
--Melissa Sulprizio (talk) 16:03, 12 January 2017 (UTC)
Visualization packages
In this section we provide information about software packages that you can use to analyze and
plot GEOS-Chem output.
GAMAP and other IDL software
NOTE: IDL, which is proprietary software, can be very expensive. For this reason,
the GEOS-Chem Support Team and other GEOS-Chem developers are currently
developing several open-source software packages (mostly based on Python) for GEOS-
Chem data analysis and visualization. Please see our Python software section below.
The traditional GEOS-Chem visualization software is GAMAP. This package was customized to
GEOS-Chem and is still heavily used today. GAMAP requires the Interactive Data Language (a
proprietary package). For more information about GAMAP, please see:
GEOS-Chem tutorials
Please see the following GEOS-Chem tutorials, mostly taken from previous GEOS-Chem
meetings:
Several GEOS-Chem users have asked "What type of machine do I need to buy in order to run
GEOS-Chem?" Here are our suggestions.
Hardware Recommendations
Here is some useful information that you can use to determine if your system has sufficient
resources to run GEOS-Chem simulations.
Memory requirements
Item Description
20 GB memory (MaxRSS)
Extra memory for special simulations:
You may want to consider at least 30 GB RAM if you plan on doing any of the following:
o Running a 2° x 2.5° global GEOS-Chem simulation and saving a lot of output fields (the
more output you generate the more memory GEOS-Chem will require)
Chris Holmes reported that a GEOS-FP 0.25° x 0.3125° NA tropchem nested simulation
required:
o 31 GB memory (MaxRSS)
Computer architecture
Jun Wang wrote:
We have an opportunity to build a large HPC. Do you know what configuration works best for GEOS-
Chem?
This is the output of the /proc/cpuinfo file on the computational nodes we use at Harvard (i.e. the
Odyssey cluster). You can ...see how it compares with [your system].
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
stepping : 2
cpu MHz : 2494.085
cache size : 30720 KB
physical id : 0
siblings : 12
core id : 0
cpu cores : 12
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 15
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp
lm constant_tsc arch_perfmon pebs bts rep_good
xtopology nonstop_tsc aperfmperf
pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
ssse3 fma cx16 xtpr pdcm pcid
dca sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand
lahf_lm abm ida arat epb xsaveopt pln pts dts
tpr_shadow vnmi flexpriority ept vpid
fsgsbase bmi1 avx2 smep bmi2 erms invpcid
bogomips : 4988.17
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
So our CPUs run at 2.5 GHz, and there are 24 CPUs/node. Each CPU has 30 MB of cache. I
think there is 4.5 to 5 GB of memory per CPU available.
Also - if you are going to use the Intel Fortran Compiler, you will always get the best
performance when using Intel CPUs. It is a known issue that the Intel Fortran Compiler
does not optimize well on AMD CPUs. This was intentional. For more information, see this
article.
GEOS-Chem v11-01 and newer versions are compatible with the GNU Fortran compiler, which
yields nearly identical results (within the bounds of numerical noise) to simulations that use the
Intel Fortran Compiler, at the cost of somewhat slower performance. GNU Fortran should also
perform better on AMD CPUs than Intel Fortran.
--Bob Yantosca (talk) 19:11, 25 April 2017 (UTC)
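If you would like to compare your own system against the /proc/cpuinfo listing above, the following Python sketch summarizes cpuinfo-style text into counts of logical processors, sockets, and physical cores. It is a minimal illustration, not a GEOS-Chem tool: the function name summarize_cpuinfo is hypothetical, the sample text is a shortened made-up listing, and the sketch assumes the usual Linux layout in which each processor entry is separated by a blank line.

```python
import re

def summarize_cpuinfo(text):
    """Count logical processors, sockets, and physical cores in cpuinfo-style text."""
    logical = 0
    cores_per_socket = {}  # "physical id" value -> "cpu cores" value
    # Each logical processor is described by one blank-line-separated block.
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in block.splitlines():
            if ":" in line:  # wrapped continuation lines have no colon and are skipped
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        socket = fields.get("physical id", "0")
        cores_per_socket[socket] = int(fields.get("cpu cores", "1"))
    return {
        "logical_processors": logical,
        "sockets": len(cores_per_socket),
        "physical_cores": sum(cores_per_socket.values()),
    }

# Shortened, hypothetical sample: one entry from each of two 12-core sockets.
SAMPLE = """\
processor : 0
physical id : 0
cpu cores : 12
flags : fpu vme de pse
 clflush dts acpi

processor : 1
physical id : 1
cpu cores : 12
"""

summary = summarize_cpuinfo(SAMPLE)
print(summary)
```

On a Linux machine you could pass the contents of /proc/cpuinfo to the same function, for example with open("/proc/cpuinfo").read().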
Network and disk
Hong Liao wrote:
We are setting up a new cluster and we have the option of installing InfiniBand network (using
Mellanox hardware) and GPFS.
My questions are:
1. Will GEOS-Chem compile and run all right on clusters based on InfiniBand and GPFS?
2. Will GEOS-Chem benefit from InfiniBand?
Bob Yantosca replied:
I can say a couple of general things:
1. On our Harvard cluster (Odyssey) we use Infiniband to connect to a fast network disk
(/n/regal). I don't know if that filesystem is GPFS.
2. For GEOS-Chem "Classic" simulations, with OpenMP parallelization, you can only use
one node of the machine. So in that case you won't be doing node-to-node
communications. The only I/O would be from the CPU to the disk, and I think in that case
Infiniband can make a great difference.
3. For GCHP, which uses MPI parallelization, then you would also have to be concerned
with inter-node communication, as you will be using CPUs across more than 1 node.
4. The disk where our met fields live at Harvard is on a Lustre file system. This is more
efficient for applications like GC that read a large volume of data.
Judit Flo Gaya replied:
What Bob said is correct. I just want to point out that GPFS and Lustre are "similar" file systems
in the sense that they are both parallel filesystems, but they work differently and are priced differently.
They are both a lot more cumbersome to configure, maintain and debug than regular NFS, but
they provide (when well implemented and tweaked) a significant increase in the performance of
reading and writing files.
An important (but probably understated) development is that GEOS-Chem v11-01 and newer can
now be compiled with the free and open-source GNU Fortran compiler (aka gfortran). Bob
Yantosca and Seb Eastham have removed and/or rewritten several sections of legacy code that
GNU Fortran could not compile. Due to their diligence, GEOS-Chem v11-01 is now compatible
with GNU Fortran v4, and GEOS-Chem v11-02 will be compatible with GNU Fortran v6 (the
latest version).
GNU Fortran breaks GEOS-Chem’s dependence on proprietary compilers like Intel Fortran (aka
ifort) and PGI Fortran (aka pgfortran), which can be prohibitively expensive to purchase. The
GNU Fortran (and C/C++) compilers come pre-installed on most versions of the Linux operating
system today (and if not, they are easy to install). GNU Fortran can also produce executables
that are well optimized for many different types of CPUs (Intel, AMD, etc.).
To validate the performance of GEOS-Chem with GNU Fortran, we ran two 1-month benchmark
simulations for v11-02, one with Intel Fortran v11.1 and another with GNU Fortran v6.2. We
posted a summary of the results on the GEOS-Chem wiki, which you can read by clicking this
link.
As you can see from the wiki, the v11-02a benchmark using GNU Fortran gives essentially
identical results (within the bounds of numerical noise) to the v11-02a benchmark using Intel
Fortran. The run time for several GEOS-Chem operations is somewhat slower, but we believe
that this might be improved with some streamlining of code. We believe that having a longer run
time is an acceptable tradeoff for not having to purchase an expensive Intel Fortran or PGI
Fortran license.
2. Development of Python-based visualization and regridding software for GEOS-Chem
We are developing a new visualization/regridding package for GEOS-Chem (called GCPy) that is
based on the free and open source Python programming language. Our first use of GCPy
was to create the plots for the GCHP benchmark simulations. While GCPy is currently not ready
for public use (as of April 2017), we will work on improving its usability in the very near future.
Having an option like GCPy will finally let us reduce our dependence on IDL-based software (e.g.
GAMAP). An IDL license is now very expensive to purchase, and is out of reach for some GEOS-
Chem user groups.
In addition to GCPy (which is still in development), there are other Python-based visualization
packages for GEOS-Chem (developed by several members of the GEOS-Chem user
community) that you can use right away. For more information, please see our Python code for
GEOS-Chem wiki page.
Jiawei Zhuang has created a tutorial on how you can set up GEOS-Chem on the Amazon EC2
compute platform. He has also collated several of the input files that you will need to customize
your login environment on EC2. For more information, please see his tutorial site:
https://cloud-gc.readthedocs.io.
P.S. At present it is not yet possible to run GCHP (our high-performance GEOS-Chem) on the
Amazon EC2 platform, due to various technical issues. We will be looking into this in the near
future.
Software Requirements
Overview
Please see this list of required software packages on our GEOS-Chem basics page.
A few notes:
1. The Linux flavor (RedHat, SuSE, Fedora, Ubuntu, etc.) is not important. Also, 64-bit
architecture is not an issue with GEOS-Chem.
2. GEOS-Chem is written in the Fortran-90 language. Fortran-90 is an extension of
Fortran-77, which for many years has been the standard programming language for scientific
computing. GEOS-Chem takes advantage of several powerful features of Fortran-90,
including dynamic memory allocation, modular program design, array operation syntax,
and derived data types. Please view Appendix 7: GEOS-Chem Style Guide in the
GEOS-Chem manual for more tips on how to write effective Fortran-90 code.
3. We use the Git version control software to manage and track GEOS-Chem software
updates. Git allows users at remote sites to easily download GEOS-Chem over the
network. Git also enables users to keep track of their changes when developing the code
and enables the creation of patches that would simplify the implementation of new
developments in the standard version. For all these reasons, you must install Git so that
you can download and manage your local GEOS-Chem source code.
--Bob Yantosca (talk) 19:17, 4 November 2016 (UTC)
Supported compilers
As of 2016, the following platforms and compilers are supported. The majority of GEOS-Chem
users compile GEOS-Chem with the Intel Fortran Compiler. GEOS-Chem v11-01 and higher
versions are now compatible with the GNU Fortran compiler as well.
In GEOS-Chem v11-01 and later versions, the Fortran compiler environment
variables must be set.
Platform   Compiler                            Status                                          Tested by
Linux      ifort 13.0.079 and similar builds   Supported                                       GCST
Linux      ifort 10.1                          Supported (but this is an old version by now)   GCST
NOTE: IFORT 15 has a compiler bug that causes errors when turning on array-out-of-bounds
checking and optimization.
You should be aware that because GEOS-Chem uses OpenMP parallelization, a single simulation can
only use as many cores as share the same memory. For example, if you have 2 PCs with 4 cores
each, then a single job can only run on 1 PC at a time (i.e. on 4 cores). This is because OpenMP
requires that all of the processors used by a job be able to see all of the memory on the
machine. In that case, you could run 2 jobs simultaneously on 4 cores each, but not a single job
on 8 cores. See OpenMP.org for more information.
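The shared-memory limit can be illustrated with a short Python sketch. The function name and the idea of describing a cluster as a list of per-machine core counts are assumptions made for this example only; they are not part of GEOS-Chem.

```python
def max_openmp_cores(cores_per_machine):
    """Return the largest core count that ONE OpenMP job can use.

    An OpenMP job cannot span machines, so the limit is the core count
    of the largest single shared-memory machine, not the cluster total.
    """
    if not cores_per_machine:
        raise ValueError("need at least one machine")
    return max(cores_per_machine)


# Two PCs with 4 cores each, as in the example in the text
machines = [4, 4]
print("Total cores on all machines:", sum(machines))
print("Cores usable by a single job:", max_openmp_cores(machines))
```

With two 4-core machines the total is 8 cores, but a single OpenMP job is still limited to 4.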
Our traditional configuration of GEOS-Chem, known as "GEOS-Chem Classic", cannot use MPI
capability. It is parallelized with the OpenMP parallelization directives.
MPI
MPI (Message Passing Interface) is required for passing memory from one physical system to
another. For example, if you wanted to run a GEOS-Chem simulation across several
independent machines (or nodes of a cluster), then this requires MPI parallelization. OpenMP
parallelization cannot be used unless all of the CPUs on the machine have access to all of the
memory on the machine (a.k.a. shared-memory architecture).
Our high-performance version of GEOS-Chem (a.k.a. GCHP) can use the OpenMPI and MVAPICH2
implementations of the MPI parallelization library.
--Bob Yantosca (talk) 20:09, 4 November 2016 (UTC)
Met field   Resolution                                   File type                              Size
GEOS-FP     4° x 5°                                      COARDS-compliant netCDF (compressed)   ~ 30 GB/yr
GEOS-FP     0.25° x 0.3125° Europe ("EU") nested grid    COARDS-compliant netCDF (compressed)   ~ 58 GB/yr
MERRA       2° x 2.5°                                    Binary (uncompressed)                  ~ 200 GB/yr
GEOS-5      2° x 2.5°                                    Binary (uncompressed)                  ~ 120 GB/yr
GEOS-5      0.5° x 0.666° nested CH                      Binary (uncompressed)                  ~ 140 GB/yr
GEOS-5      0.5° x 0.666° nested NA                      Binary (uncompressed)                  ~ 160 GB/yr
NOTES:
GEOS-Chem reads and writes data using the netCDF file format. NetCDF is a self-
describing file format that can store data fields as well as the relevant "metadata", or
information about the contents of the file. Types of metadata include descriptive names,
units, horizontal and vertical coordinates, file creation date/time, file history, etc.
The netCDF frequently asked questions (FAQ) guide gives this short overview of netCDF:
NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access
and a freely distributed collection of data access libraries for C, Fortran, C++, Java, and
other languages. The netCDF libraries support a machine-independent format for
representing scientific data. Together, the interfaces, libraries, and format support the
creation, access, and sharing of scientific data.
A netCDF installation contains library files (ending in .a), which hold compiled utility
routines meant to be called from programs written in C or Fortran. In netCDF-4.1 and prior
versions, the C-language library file (libnetcdf.a) and the Fortran-language library file
(libnetcdff.a) were always installed into the same folder by default. But starting with
netCDF-4.2, the netCDF Fortran libraries now must be built from a separate distribution
package. Because of this new configuration, you might find that
the libnetcdff.a (Fortran) and libnetcdf.a (C) library files are stored in separate
folders on your system. Ask your IT staff for more information about how netCDF is installed
on your system.
If you are using GEOS-Chem on the Amazon EC2 cloud, then a version of netCDF will
ship with the machine image that you'll use to initialize your computational environment.
Therefore, you will not have to install netCDF yourself (unless you are an advanced user and
need a specific version for a particular application.)
If you are using GEOS-Chem on your institution's computer system, chances are that
your IT staff will have already installed one or more netCDF library versions that you can
use. For users that do not have the netCDF libraries installed on their system, the
GEOS-Chem Support Team has constructed the GEOS-Chem-Libraries installer package. See
the Installing libraries for GEOS-Chem wiki page for detailed instructions. Quick links to
subsections of that wiki page are included below:
will load the default Intel Fortran Compiler version and the default netCDF version that was built
with the Intel Fortran Compiler. You can also load specific compiler and netCDF versions by
giving the version numbers, such as:
You can place these module load commands into your .bashrc or .cshrc startup file so that
they will be executed every time you log in.
On most computer systems, the module command will also export one or more environment
variables containing the directory paths where the libraries and relevant files can be found. This
will allow you to always find the proper netCDF library on disk without having to hardwire the
directory path in your .bashrc or .cshrc system startup file.
For example, the module load command listed above will export
the NETCDF_HOME, NETCDF_INCLUDE, and NETCDF_LIB variables into your Unix environment.
The NETCDF_HOME variable is the root folder of the netCDF library. Include files (*.h) and
compiled module files (*.mod) are stored in the NETCDF_INCLUDE folder. Library files (*.a) are
stored in the NETCDF_LIB folder.
The names of these environment variables will differ from system to system. Ask your IT staff
about how the module command is implemented on your computer.
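As a quick sanity check after loading a module, a short Python sketch like the one below can report which of these variables are set. The variable names are the ones used in this section and may differ on your system; the helper function names are illustrative only.

```python
import os


def netcdf_paths(env=None,
                 names=("NETCDF_HOME", "NETCDF_INCLUDE", "NETCDF_LIB")):
    """Return {variable name: value or None} for the netCDF variables."""
    env = os.environ if env is None else env
    return {name: env.get(name) for name in names}


def missing_variables(paths):
    """List the variables that are unset or empty."""
    return [name for name, value in paths.items() if not value]


# Report what the current login environment provides
paths = netcdf_paths()
for name, value in paths.items():
    print(f"{name:15s} = {value if value else '(not set)'}")
```

If any variable is reported as not set, the module on your system probably exports different names.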
Special handling for netCDF-4.2 and higher versions
The C and Fortran library files are built from different installation packages in netCDF-4.2 and
higher versions. Because of this dichotomy, your IT staff may have installed the netCDF Fortran
library as a completely separate software module.
For example, we use these commands on the Odyssey cluster at Harvard to load the version of
netCDF that was compiled with the GNU Fortran compiler:
As you can see, the netCDF Fortran library (version 4.4.2) has to be loaded separately from the
netCDF C-language library (version 4.3.3.1). In this particular case, both the netCDF C and
Fortran libraries also rely on the OpenMPI library (although GEOS-Chem "Classic" doesn't need
it).
The above module load commands will export the following variables into your Unix
environment: NETCDF_HOME, NETCDF_INCLUDE, NETCDF_LIB, NETCDF_FORTRAN_HOME,
NETCDF_FORTRAN_INCLUDE, NETCDF_FORTRAN_LIB.
Because the netCDF Fortran library is loaded as a separate module, it has its own environment
variables (NETCDF_FORTRAN_HOME, NETCDF_FORTRAN_INCLUDE, NETCDF_FORTRAN_LIB) to
define the relevant directory paths. These are analogous to NETCDF_HOME, NETCDF_INCLUDE,
and NETCDF_LIB as mentioned in the prior section.
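To show why the two sets of variables matter, here is a minimal Python sketch that assembles include and link flags when the C and Fortran libraries live in different folders. This is not how the GEOS-Chem Makefiles build their flags; the function name and the example directory paths are hypothetical. The library names follow from the libnetcdff.a and libnetcdf.a files described above.

```python
def netcdf_link_flags(env):
    """Return compiler/linker flags for netCDF as a list of strings."""
    c_inc, c_lib = env["NETCDF_INCLUDE"], env["NETCDF_LIB"]
    # Fall back to the C locations when no separate Fortran module is loaded
    f_inc = env.get("NETCDF_FORTRAN_INCLUDE", c_inc)
    f_lib = env.get("NETCDF_FORTRAN_LIB", c_lib)

    flags = [f"-I{f_inc}"]
    if c_inc != f_inc:
        flags.append(f"-I{c_inc}")
    # List the Fortran library first: libnetcdff depends on libnetcdf
    flags += [f"-L{f_lib}", "-lnetcdff"]
    if c_lib != f_lib:
        flags.append(f"-L{c_lib}")
    flags.append("-lnetcdf")
    return flags


# Hypothetical install locations, for illustration only
example_env = {
    "NETCDF_INCLUDE": "/opt/netcdf-c/include",
    "NETCDF_LIB": "/opt/netcdf-c/lib",
    "NETCDF_FORTRAN_INCLUDE": "/opt/netcdf-fortran/include",
    "NETCDF_FORTRAN_LIB": "/opt/netcdf-fortran/lib",
}
print(" ".join(netcdf_link_flags(example_env)))
```

When the Fortran variables are absent, the sketch falls back to a single include folder and a single library folder, which matches the netCDF-4.1 layout.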
cd spack/bin
And then define the environment variables in your .bashrc startup script (or .cshrc if you are
using csh/tcsh) to point to the following folders:
Useful tools
There are many free and open-source software packages readily available for visualizing and
manipulating netCDF files. These tools will reduce the need for the GEOS-Chem user community
to rely on IDL (and GAMAP), which can be prohibitively expensive for some user groups. Some
recommended tools are listed below.
Name         Description
ncdump       This command-line tool generates a text representation of netCDF data and can be used to
             quickly view the variables contained in a netCDF file. The ncdump utility is installed
             with your netCDF library distribution.
ncview       Visualization package for netCDF files. Ncview has limited features, but is great for
             getting a quick look at the contents of netCDF files.
Panoply      Data viewer for netCDF files. This package offers an alternative to ncview. From our
             experience, Panoply works nicely when installed on the desktop, but is slow to respond in
             the Linux environment.
nco and cdo  Command-line tools for manipulating and analyzing netCDF files. Useful for renaming
             variables and attributes, and for regridding data.
xarray       Python package that lets you read the contents of a netCDF file into a data structure.
             The data can then be further manipulated or converted to e.g. numpy arrays for further
             processing.
xbpch        Python package that lets you read the contents of a binary punch file into an xarray
             Dataset object.
GCPy         Python-based package for visualizing and analyzing GEOS-Chem output. Currently under
             development. GCPy will soon be used to produce the plots from GEOS-Chem 1-month and
             1-year benchmark output.
GAMAP        Data visualization package written in IDL. Although GAMAP is currently being phased out,
             GAMAP routine BPCH2COARDS is still useful for converting data stored in the GEOS-Chem
             "binary punch" format to netCDF format. See our Converting files from binary punch
             format to netCDF section below.
Some of the tools listed above, such as ncdump and ncview, may come pre-installed on your
system. Others may need to be installed or loaded (e.g. via the module load command). Check
with your system administrator or IT staff to see what is available on your system.
--Bob Yantosca (talk) 21:49, 29 November 2018 (UTC)
#!/usr/bin/env python
import xarray as xr
import xbpch as xb
# Open the bpch file and save it into an xarray Dataset object
# NOTE: For best results, also specify the corresponding
# tracerinfo.dat and diaginfo.dat metadata files
ds = xb.open_bpchdataset(filename="my_data_file.bpch",
                         tracerinfo_file="tracerinfo.dat",
                         diaginfo_file="diaginfo.dat")
Using IDL
Follow these instructions:
Use GAMAP routine BPCH2COARDS
You can use the GAMAP routine BPCH2COARDS to create netCDF files from a GEOS-Chem
binary punch file. For example, start IDL and then type this command at the IDL prompt:
uvalbedo.geos.2x25.19850101.nc
uvalbedo.geos.2x25.19850201.nc
uvalbedo.geos.2x25.19850301.nc
uvalbedo.geos.2x25.19850401.nc
uvalbedo.geos.2x25.19850501.nc
uvalbedo.geos.2x25.19850601.nc
uvalbedo.geos.2x25.19850701.nc
uvalbedo.geos.2x25.19850801.nc
uvalbedo.geos.2x25.19850901.nc
uvalbedo.geos.2x25.19851001.nc
uvalbedo.geos.2x25.19851101.nc
uvalbedo.geos.2x25.19851201.nc
Note that BPCH2COARDS will create a new file for each time slice. The %DATE% token in the output
file name will be replaced with the year-month-day value for each time stamp. In the above
example, the binary punch file uvalbedo.geos.2x25 contains monthly data,
therefore BPCH2COARDS will create 12 individual netCDF files.
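The token replacement can be mimicked in a few lines of Python. This is only a sketch of the naming scheme, not the IDL implementation of BPCH2COARDS. The template string uvalbedo.geos.2x25.%DATE%.nc is an assumption inferred from the file names listed above.

```python
from datetime import date


def expand_date_token(template, timestamps):
    """Replace %DATE% with YYYYMMDD for each time stamp."""
    return [template.replace("%DATE%", d.strftime("%Y%m%d"))
            for d in timestamps]


# One time slice per month of 1985, as in the monthly example file
months = [date(1985, m, 1) for m in range(1, 13)]
names = expand_date_token("uvalbedo.geos.2x25.%DATE%.nc", months)
for name in names:
    print(name)
```

Twelve time stamps produce twelve file names, one per monthly time slice.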
NOTE: You might sometimes have better luck using the BPCH_SEP routine to split the
bpch files into smaller bpch files (e.g. one per month) and then using BPCH2COARDS on
the smaller files.
Special note for timeseries data: To use BPCH2COARDS to convert timeseries (e.g. hourly,
3-hourly, etc.) data to netCDF format, add the %TIME% token to the netCDF file name. For example:
This will create one new netCDF file for each timestamp in the bpch file. You can then proceed to
Step 2 and Step 3 below to concatenate these files into a single netCDF file.
--Bob Y. (talk) 18:11, 1 June 2015 (UTC)
Concatenate the netCDF files
You can use the ncrcat command of the netCDF Operators (nco) to concatenate the 12
individual files created by BPCH2COARDS into a single netCDF file. Make sure you have exited IDL,
and then type the following command at the Unix prompt:
You can then discard the uvalbedo.geos.2x25.1985*.nc files that were created directly
by BPCH2COARDS.
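If you prefer to drive the concatenation from a script, the following Python sketch builds an ncrcat argument list from the monthly file names. It assumes the basic "ncrcat input files, then output file" calling form of NCO and uses no optional flags. The helper name is hypothetical, and the command is only printed, not executed.

```python
import shlex


def ncrcat_command(input_files, output_file):
    """Return an argument list that concatenates files along the record dimension."""
    # Sort so that the time slices are joined in chronological order
    return ["ncrcat", *sorted(input_files), output_file]


inputs = [f"uvalbedo.geos.2x25.1985{m:02d}01.nc" for m in range(1, 13)]
cmd = ncrcat_command(inputs, "uvalbedo.geos.2x25.nc")
print(shlex.join(cmd))
# To run it for real (requires NCO on your system):
#   import subprocess; subprocess.run(cmd, check=True)
```

Sorting the inputs matters because ncrcat appends records in the order in which the files are given.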
--Bob Y. 12:10, 3 March 2015 (EST)
Further Edit variable names and attributes
Whether you use Python or IDL to create a netCDF file from a bpch file, you will still need to edit
the variable attributes in order to make the file COARDS-compliant. Please see this section
below for more information.
--Bob Yantosca (talk) 15:20, 3 December 2018 (UTC)
netcdf EMEP.geos.1x1 {
dimensions:
lon = 360 ;
lat = 181 ;
time = UNLIMITED ; // (17 currently)
variables:
float lon(lon) ;
lon:standard_name = "longitude" ;
lon:long_name = "Longitude" ;
lon:units = "degrees_east" ;
lon:axis = "X" ;
lon:_Storage = "chunked" ;
lon:_ChunkSizes = 360 ;
lon:_DeflateLevel = 1 ;
float lat(lat) ;
lat:standard_name = "latitude" ;
lat:long_name = "Latitude" ;
lat:units = "degrees_north" ;
lat:axis = "Y" ;
lat:_Storage = "chunked" ;
lat:_ChunkSizes = 181 ;
lat:_DeflateLevel = 1 ;
double time(time) ;
time:standard_name = "time" ;
time:units = "hours since 1985-01-01 00:00:00" ;
time:calendar = "standard" ;
time:_Storage = "chunked" ;
time:_ChunkSizes = 524288 ;
time:_DeflateLevel = 1 ;
float PRPE(time, lat, lon) ;
PRPE:long_name = "Propene" ;
PRPE:units = "kgC/m2/s" ;
PRPE:gamap_category = "ANTHSRCE" ;
PRPE:_Storage = "chunked" ;
PRPE:_ChunkSizes = 1, 181, 360 ;
PRPE:_DeflateLevel = 1 ;
float ALK4(time, lat, lon) ;
ALK4:long_name = "Alkanes(>C4)" ;
ALK4:units = "kgC/m2/s" ;
ALK4:gamap_category = "ANTHSRCE" ;
ALK4:_Storage = "chunked" ;
ALK4:_ChunkSizes = 1, 181, 360 ;
ALK4:_DeflateLevel = 1 ;
... etc ...
// global attributes:
:CDI = "Climate Data Interface version 1.5.5
(http://code.zmaw.de/projects/cdi)" ;
:Conventions = "COARDS" ;
:history = "Wed Apr 23 17:36:28 2014: cdo mulc,10000
tmptmp.nc EMEP.geos.1x1.nc\n",
:Title = "COARDS/netCDF file created by BPCH2COARDS (GAMAP
v2-03+)" ;
:Model = "GEOS3" ;
:Grid = "GEOS_1x1" ;
:Delta_Lon = 1.f ;
:Delta_Lat = 1.f ;
:NLayers = 48 ;
:Start_Date = 19800101 ;
:Start_Time = 0 ;
:End_Date = 19810101 ;
:End_Time = 0 ;
:Delta_Time = 240000 ;
:Temp_Res = "CONSTANT" ;
:CDO = "Climate Data Operators version 1.5.5
(http://code.zmaw.de/projects/cdo)" ;
data:
You can also use ncdump to display the data values for a given variable in the netCDF file. This
command will display the values in the SpeciesRst_NO variable to the screen:
#!/usr/bin/env python
# Imports
import xarray as xr
import numpy as np
<xarray.Dataset>
Dimensions: (lat: 46, lev: 72, lon: 72, time: 1)
Coordinates:
* lon (lon) float64 -180.0 -175.0 -170.0 -165.0 -160.0
...
* lat (lat) float64 -89.0 -86.0 -82.0 -78.0 -74.0 -70.0
...
* lev (lev) float64 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0
...
* time (time) datetime64[ns] 2016-07-01
Data variables:
AREA (lat, lon) float64 ...
SpeciesRst_RCOOH (time, lev, lat, lon) float32 ...
SpeciesRst_O2 (time, lev, lat, lon) float32 ...
... etc...
SpeciesRst_O3 (time, lev, lat, lon) float32 ...
SpeciesRst_NO (time, lev, lat, lon) float32 ...
Attributes:
title: GEOSChem restart
history: Created by routine NC_CREATE (in ncdf_mod.F90)
format: NetCDF-4
conventions: COARDS
Units of SpeciesRst_O3: mol/mol
Sum of SpeciesRst_O3: 0.40381380915641785
> cd /mnt/gcgrid/data/ExtData/HEMCO/GFED4/v2015-10/2013
> isCoards GFED4_3hrfrac_gen.025x025.201301.nc
===========================================================================
Filename: GFED4_3hrfrac_gen.025x025.201301.nc
===========================================================================
Operation Command
Here are some specific commands that we used on the uvalbedo.geos.2x25.nc file from our
example in a previous section. If you need to apply these commands to more than one file, you
can place them into a script.
# Add history global attribute
ncatted -a history,global,o,c,"Tue Mar 3 12:18:38 EST 2015" \
  uvalbedo.geos.2x25.nc
# Add references
ncatted -a references,global,o,c,"www.geos-chem.org; wiki.geos-chem.org" \
  uvalbedo.geos.2x25.nc
# Update title
ncatted -a title,global,o,c,"UV albedo data from Hermann & Celarier (1997)" \
  uvalbedo.geos.2x25.nc
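One way to apply the same attribute edits to several files is to generate the ncatted commands in a loop. The Python sketch below does this for two of the attributes shown above. The helper name and the second file name are hypothetical, and the commands are only printed so that you can inspect them before running anything.

```python
import shlex


def ncatted_global(att_name, att_value, filename):
    """Argument list that overwrites (o) a global character (c) attribute."""
    return ["ncatted", "-a", f"{att_name},global,o,c,{att_value}", filename]


attributes = {
    "references": "www.geos-chem.org; wiki.geos-chem.org",
    "title": "UV albedo data from Hermann & Celarier (1997)",
}
# The second file name is a placeholder for illustration
files = ["uvalbedo.geos.2x25.nc", "another_file.nc"]

commands = [ncatted_global(name, value, f)
            for f in files
            for name, value in attributes.items()]
for cmd in commands:
    print(shlex.join(cmd))
```

Passing each command as an argument list (for example to subprocess.run) avoids shell quoting problems with the spaces and the & character in the attribute values.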
Name Description
cdo The Climate Data Operators include tools for regridding netCDF files. For example, the following command
will apply distance-weighted regridding:
nco The netCDF Operators also include tools for regridding. See the Regridding section of the NCO User
Guide for more information.
xESMF Jiawei Zhuang has created xESMF, a universal regridding tool for geospatial data, which is written in
Python. It can be used to regrid data not only on cartesian grids, but also on cubed-sphere and unstructured
grids. For more information, see: https://xesmf.readthedocs.io
The last latitudinal band (-89.5) remains empty and gets filled with the standard missing value of
cdo, which is really large. This leads to immediate problems in the methane simulation as
enormous concentrations enter the domain from the South Pole. For now I’ve solved this
problem by just using bicubic interpolation
(2) Sal Farina wrote a simple Python script for adding a new species to a netCDF restart file:
#!/usr/bin/env python
import netCDF4 as nc
import sys
import os
f.createVariable('SpeciesRst_SOAP', o.datatype, dimensions=o.dimensions,
                 fill_value=o._FillValue)
soap = f['SpeciesRst_SOAP']
soap[:] = 0.0
soap.long_name= 'SOAP species'
soap.units = o.units
soap.add_offset = 0.0
soap.scale_factor = 1.0
soap.missing_value = 1.0e30
f.close()
dimensions:
time = UNLIMITED ; // (12 currently)
lev = 72 ;
lat = 181 ;
lon = 360 ;
Then you can issue this command to apply the optimal chunking along levels:
nccopy -c lon/360,lat/181,lev/1,time/1 -d1 myfile.nc tmp.nc
mv tmp.nc myfile.nc
This will create a new file called tmp.nc that has the proper chunking. We then
replace myfile.nc with this temporary file.
You can specify the chunk sizes that will be applied to the variables in the netCDF file with the
-c argument to nccopy. To obtain the optimal chunking, the lon chunksize must be identical to the
number of values along the longitude dimension (e.g. lon/360) and the lat chunksize must be
equal to the number of points in the latitude dimension (e.g. lat/181).
We also recommend that you deflate (i.e. compress) the netCDF data variables at the same time
you apply the chunking. Deflating can substantially reduce the file size, especially for emissions
data that are only defined over the land but not over the oceans. You can deflate the data in a
netCDF file by specifying the -d argument to nccopy. There are 10 possible deflation levels,
ranging from 0 (no deflation) to 9 (max deflation). For most purposes, a deflation level of 1 (-d1)
is sufficient.
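The chunking rule (full size along lon and lat, size 1 along every other dimension) can be written as a small Python helper that builds the -c argument from the dimension sizes. The function name is hypothetical, and the nccopy command is only assembled, not executed.

```python
def chunk_spec(dim_sizes, full_dims=("lon", "lat")):
    """Build the nccopy -c argument.

    Horizontal dimensions get a chunk equal to their full length;
    all other dimensions (lev, time) get a chunk size of 1.
    """
    return ",".join(
        f"{name}/{size if name in full_dims else 1}"
        for name, size in dim_sizes.items()
    )


# Dimension sizes taken from the example file header above
dims = {"lon": 360, "lat": 181, "lev": 72, "time": 12}
spec = chunk_spec(dims)
cmd = ["nccopy", "-c", spec, "-d1", "myfile.nc", "tmp.nc"]
print(" ".join(cmd))
```

For the example dimensions this reproduces the chunk specification used in the nccopy command shown earlier in this section.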
The GEOS-Chem Support Team has created a script named nc_chunk.pl that will
automatically chunk and compress data for you. You may obtain this script from our NcdfUtilities
repository. We also recommend that you copy nc_chunk.pl into a folder that is in your search
path (such as ~/bin) so that it will be available to you in whatever directory you are working in.
You can use the ncdump -cts myfile.nc command to view the chunk size and deflation level
in the file. After applying the chunking and compression to myfile.nc, you would see output
such as this:
dimensions:
time = UNLIMITED ; // (12 currently)
lev = 72 ;
lat = 181 ;
lon = 360 ;
variables:
float PRPE(time, lev, lat, lon) ;
PRPE:long_name = "Propene" ;
PRPE:units = "kgC/m2/s" ;
PRPE:add_offset = 0.f ;
PRPE:scale_factor = 1.f ;
PRPE:_FillValue = 1.e+15f ;
PRPE:missing_value = 1.e+15f ;
PRPE:gamap_category = "ANTHSRCE" ;
PRPE:_Storage = "chunked" ;
PRPE:_ChunkSizes = 1, 1, 181, 360 ;
PRPE:_DeflateLevel = 1 ;
PRPE:_Endianness = "little" ;
float CO(time, lev, lat, lon) ;
CO:long_name = "CO" ;
CO:units = "kg/m2/s" ;
CO:add_offset = 0.f ;
CO:scale_factor = 1.f ;
CO:_FillValue = 1.e+15f ;
CO:missing_value = 1.e+15f ;
CO:gamap_category = "ANTHSRCE" ;
CO:_Storage = "chunked" ;
CO:_ChunkSizes = 1, 1, 181, 360 ;
CO:_DeflateLevel = 1 ;
CO:_Endianness = "little" ;
The attributes that begin with an _ character are "hidden" netCDF
attributes. They represent file properties instead of user-defined properties (like the long name,
units, etc.). The "hidden" attributes can be shown by adding the -s argument to ncdump.
--Bob Yantosca (talk) 15:31, 13 April 2018 (UTC)
4.1 Overview
The GEOS-Chem shared data directories contain the various meteorological fields, emission
inventories, scale factors, and other data that GEOS-Chem reads in during the course of a
simulation.
If you are using the Amazon EC2 cloud: The GEOS-Chem shared data directories
are automatically synced to the Amazon S3 storage system. You can easily access
this data from the home directory of your Amazon EC2 cloud instance. For more
information, please see Jiawei Zhuang's comprehensive cloud computing
tutorial: cloud-gc.readthedocs.io.
If you are using your institution's computer system: You (or your IT staff) will
have to download the meteorological fields and emissions data that are used to drive
GEOS-Chem. Please proceed to Section 4.2.
Because of the large volume of data, you must download the shared data directories via FTP
or a similar utility such as wget, FireFTP, or SecureFX. Tracking the shared data directory
structure with Git is impossible due to its size. See the Minimum system requirements for
GEOS-Chem wiki page for information on typical disk space requirements for GEOS-Chem.
If your research group is setting up GEOS-Chem for the first time: You (or your IT staff)
will only have to download the full set of the GEOS-Chem shared data directories once. As
new emissions inventories and met field files become available, you can download the new
files or directories individually on an as-needed basis.
If your research group already consists of several GEOS-Chem users: The GEOS-Chem
shared directory structure should already be stored in a common disk space on your
computer cluster. In this case, you (or your IT staff) will only need to download the new
emissions inventories and related data files that were introduced in this version.
For detailed instructions on downloading the shared data directories, please see the
following wiki page subsections:
Important Note! All shared data directories are now subdirectories of a single root directory
called ExtData. If downloading the shared data directories for the first time, you (or IT staff)
must set up the ExtData directory prior to running GEOS-Chem. If you have previously
downloaded the GEOS-Chem shared data directories, you can simply add symbolic links
from ExtData to the existing data directories. Please see the Setting up
the ExtData directory wiki page for detailed instructions on setting up your shared data
directories so that they are compatible with GEOS-Chem.
Overview
Data directory structure prior to v10-01
Emissions and meteorological data for GEOS-Chem are arranged into a directory tree. In versions
of GEOS-Chem prior to v10-01, you would specify the directory where GEOS-Chem could find
the emissions and meteorological field files for the given resolution that you were using by setting
this line in your input.geos file:
In this case the root-level directory, or top of the directory tree, is /dir/to/data/ and the
specific data directory for the 4° x 5° resolution data is GEOS_4x5/.
NOTE: The example root-level directory /dir/to/data will vary from system to system. For
example, on the Harvard data servers, the root-level directory is /as/data/geos. If you are not
sure what the root-level directory is on your disk server, then ask your sysadmin or IT staff.
This would indicate you were running GEOS-Chem at 4° x 5° resolution. GEOS-Chem would
then look for the 4° x 5° meteorological field data via these data paths (all of which are
subfolders of the root data directory):
If you wanted to run GEOS-Chem at 2° x 2.5° resolution, you would use this setting for the root
data directory:
and GEOS-Chem would read the 2° x 2.5° meteorological data via these paths:
The HEMCO emissions component makes it possible to read emissions inventories and
other relevant data sets at much higher resolution than 4° x 5° or 2° x 2.5°.
In addition, HEMCO has the capability to regrid data from its native resolution to the
resolution of your GEOS-Chem simulation. We no longer need to store separate copies of
emissions data on multiple grids.
Under these circumstances, referring to a top-level directory
named GEOS_4x5 or GEOS_2x2.5 can lead to confusion.
When running GEOS-Chem in the ESMF/MAPL environment, accepted practice is to read
data from a directory tree where all of the data folders are subdirectories of a folder
named ExtData.
For all of these reasons, we have decided to restructure the GEOS-Chem data directory tree.
Starting with GEOS-Chem v10-01, all of the GEOS-Chem data directories will be subdirectories
of the ExtData directory. As explained in the next section, you can make symbolic links from
ExtData to the existing GEOS-Chem data directories.
Note: Using wget to download directories within ExtData will not work for symbolically
linked directories such as those in the ExtData/CHEM_INPUTS/ folder. When downloading
data for the first time, you will need to use wget with non-symbolically linked data
directories only. The locations of these directories for ExtData/CHEM_INPUTS/ are listed
below.
> cd /dir/to/data
> ls -1
GEOS_0.25x0.3125_CH/
GEOS_0.25x0.3125_NA/
GEOS_0.25x0.3125_NA.d@
GEOS_0.5x0.666_CH/
GEOS_0.5x0.666_CH.d@
GEOS_0.5x0.666_NA/
GEOS_0.5x0.666_NA.d@
GEOS_2x2.5/
GEOS_2x2.5.d/
GEOS_4x5/
GEOS_4x5.d/
GEOS_MEAN/
GEOS_NATIVE/
Your actual listing will differ, depending on the data you have stored on your disk server. NOTE:
In the above example, / denotes directories, and @ denotes symbolic links.
NOTE: If none of these data directories have been previously downloaded to your disk
server, then you will have to download them from one of the GEOS-Chem data archives.
See our Downloading GEOS-Chem source code and data wiki page for more instructions.
3. Cut-and-paste the directory output from Step 2 to a text editor. You'll need to use this again in
a couple of steps.
4. Create the /dir/to/data/ExtData subdirectory and switch to it.
/dir/to/data/ExtData
5. Create a symbolic link from ExtData to each directory in the listing that you saved from Step
2.
> ln -s ../GEOS_0.25x0.3125_CH
> ln -s ../GEOS_0.25x0.3125_NA
> ln -s ../GEOS_0.25x0.3125_NA.d
> ln -s ../GEOS_0.5x0.666_CH
> ln -s ../GEOS_0.5x0.666_CH.d
> ln -s ../GEOS_0.5x0.666_NA
> ln -s ../GEOS_0.5x0.666_NA.d
> ln -s ../GEOS_2x2.5
> ln -s ../GEOS_2x2.5.d
> ln -s ../GEOS_4x5
> ln -s ../GEOS_4x5.d
> ln -s ../GEOS_MEAN
> ln -s ../GEOS_NATIVE .
6. Create the subdirectory ExtData/CHEM_INPUTS and switch to it. This directory will hold
various input files needed for various chemistry modules.
/dir/to/data/ExtData/CHEM_INPUTS
7. If you already have the GEOS_NATIVE data directory on your system, create symbolic links
from ExtData/CHEM_INPUTS to the following directories in ../GEOS_NATIVE:
ln -s ../GEOS_NATIVE/FastJ_201204
ln -s ../GEOS_NATIVE/Linoz_200910
ln -s ../GEOS_NATIVE/MODIS_LAI_201204
ln -s ../GEOS_NATIVE/Olson_Land_Map_201203
ln -s ../GEOS_NATIVE/TOMAS_201402
ln -s ../GEOS_NATIVE/UCX_201403
8. If you are downloading data for the first time and do not have the GEOS_NATIVE directory on
your system, use wget to retrieve the following directories that are now symbolically linked
within ExtData/CHEM_INPUTS. On your system, these ExtData/CHEM_INPUTS/ directories will
not be symbolic links.
8a. If you plan to use the RRTMG radiative transfer model option in GEOS-Chem, then download
the following data directories, which are new to ExtData/CHEM_INPUTS:
8b. If you use GEOS-4 or GCAP met fields, then also copy this data directory
to ExtData/CHEM_INPUTS:
This data directory contains netCDF versions of the annual mean tropopause data files
used for the GEOS-4 and GCAP simulations. If you do not use these met fields, then you
can ignore this step.
9. Switch back to the ExtData directory:
> cd ..
> pwd
/dir/to/data/ExtData
10. Download the HEMCO data directories into ExtData. You can do this with our
hemco_data_download package, which can be obtained via Git.
###############################################################################
#                                                                             #
# Specify the remote and local HEMCO data paths, plus other options.          #
#                                                                             #
###############################################################################
10c. Look at this list of emission inventories and data sets that you can use with HEMCO.
Decide which of these you would like to download. Modify
the hemcoDataDownload.rc configuration file according to these instructions.
10d. Once you have set up your configuration file, follow these instructions to download
the HEMCO data directories to /dir/to/data/ExtData.
--Bob Y. 17:29, 6 April 2015 (EDT)
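Steps 4 and 5 above (create ExtData, then link each existing data directory into it) can also be scripted. The Python sketch below is one possible approach, demonstrated on a temporary directory tree so that it does not touch real data. The function name is hypothetical, and the relative link targets mirror the ln -s ../NAME commands in Step 5.

```python
import tempfile
from pathlib import Path


def link_into_extdata(root):
    """Create root/ExtData and symlink every sibling data directory into it."""
    root = Path(root)
    extdata = root / "ExtData"
    extdata.mkdir(exist_ok=True)
    linked = []
    for entry in sorted(root.iterdir()):
        # Skip ExtData itself and anything that is not a directory
        if entry.name == "ExtData" or not entry.is_dir():
            continue
        link = extdata / entry.name
        if not link.exists() and not link.is_symlink():
            # Relative target, like "ln -s ../NAME" run inside ExtData
            link.symlink_to(Path("..") / entry.name, target_is_directory=True)
        linked.append(entry.name)
    return linked


# Demonstrate on a throwaway directory tree rather than on real data
with tempfile.TemporaryDirectory() as tmp:
    for name in ("GEOS_4x5", "GEOS_4x5.d", "GEOS_NATIVE"):
        (Path(tmp) / name).mkdir()
    print(link_into_extdata(tmp))
```

Relative link targets keep the tree valid if the root data directory is later moved or mounted under a different path.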
Data directories
Here is a list of the data directories in the ExtData structure. NOTE: Your
listing may differ, depending on which met field data sets you have stored
on your disk server.
Directory      Description
CHEM_INPUTS    Non-emissions data for GEOS-Chem chemistry modules, which cannot be read
               via HEMCO:
               FastJ_201204: Symbolic link to directory with inputs for FAST-JX
                 photolysis
               Linoz_200910: Symbolic link to directory with inputs for the Linoz
                 stratospheric ozone chemistry module
               MODIS_LAI_201204: Symbolic link to directory with MODIS leaf area index
                 data (needed for drydep and biogenic emissions)
               Olson_Land_Map_201203: Symbolic link to directory with Olson land map
                 data files (needed for drydep)
               TOMAS_201402: Symbolic link to directory with inputs for TOMAS aerosol
                 microphysics
               UCX_201403: Inputs for the UCX chemistry mechanism
               ann_mean_trop_200202: Directory containing annual mean tropopause data
                 files (netCDF format). Needed for backwards compatibility for GMAO
                 GEOS-4 and GCAP simulations
               RRTMG_201411/: Directory containing climatological N2O and CH4 profiles
                 for input into RRTMG.
               modis_surf_201210/: Directory containing surface albedo & emissivity
                 for input into RRTMG.
GEOS_0.25x0.3125_CH and GEOS_0.25x0.3125_CH.d
               Symbolic links to directories that store data on the GEOS-FP
               0.25° x 0.3125° China nested grid:
../GEOS_0.25x0.3125_CH
../GEOS_0.25x0.3125_CH.d
../GEOS_0.25_x_0.3125_EU
../GEOS_0.25_x_0.3125_EU.d
../GEOS_0.25_x_0.3125_NA
../GEOS_0.25_x_0.3125_NA.d
../GEOS_2x2.5
../GEOS_2x2.5.d
../GEOS_4x5
../GEOS_4x5.d
../GEOS_0.5_x_0.666_CH
../GEOS_0.5_x_0.666_CH.d
../GEOS_0.5_x_0.666_EU
../GEOS_0.5_x_0.666_EU.d
../GEOS_0.5_x_0.666_NA
../GEOS_0.5_x_0.666_NA.d
etc.
NOTE: Directories ending in .d (such as GEOS_4x5.d) contain only met
field data. These are reachable by symbolic links from the corresponding
directories not ending in .d. For example, GEOS_4x5/GEOS_FP/ links
to GEOS_4x5.d/GEOS_FP. Historically, this was done to
separate met field data (which can be several GB or TB in size) from other
non-met field data files (e.g. emissions), in order to facilitate disk
management. Please see this wiki post for more information.
--Bob Y. 11:25, 10 April 2015 (EDT)
then you should see the following output in the log file:
===============================================================================
G E O S - C H E M U S E R I N P U T
SIMULATION MENU
---------------
Start time of run : 20160701 000000
End time of run : 20160801 000000
Run directory : ./
Data Directory : /dir/to/data/ExtData/
CHEM_INPUTS directory : /dir/to/data/ExtData/CHEM_INPUTS/
Resolution-specific dir : GEOS_4x5/
GEOS-FP sub-directory : GEOS_4x5/GEOS_FP/YYYY/MM/
MERRA-2 sub-directory : GEOS_4x5/MERRA2/YYYY/MM/
... etc ...
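The YYYY and MM tokens in the sub-directory settings are placeholders for the year and month being read. GEOS-Chem expands them internally; the Python sketch below only illustrates the idea, using a hypothetical function name and the example paths from the log output above.

```python
from datetime import date
from pathlib import PurePosixPath


def met_field_dir(data_dir, template, day):
    """Replace the YYYY and MM tokens and join onto the root data directory."""
    sub = (template.replace("YYYY", f"{day.year:04d}")
                   .replace("MM", f"{day.month:02d}"))
    return PurePosixPath(data_dir) / sub


path = met_field_dir("/dir/to/data/ExtData/",
                     "GEOS_4x5/GEOS_FP/YYYY/MM/",
                     date(2016, 7, 1))
print(path)
```

For the start date in the log output (20160701), the GEOS-FP files would be read from the 2016/07 folder under GEOS_4x5/GEOS_FP.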
Item Description
A netCDF library installation.
    Most of the data files read by GEOS-Chem v11-01 are now in COARDS-compliant netCDF
    format. As part of our High-Performance Computing (HPC) GEOS-Chem project, we are in
    the process of converting the remaining binary data files to netCDF format. This will
    facilitate running GEOS-Chem in HPC environments, such as our efforts to interface
    GEOS-Chem with the NASA GEOS-5 GCM.
    If you are using a shared computer system, then your IT staff may have already pre-built
    a version of the netCDF library that you can load into your Unix environment. If not,
    then you can build a netCDF library with our installer package.
    For more information, please see the following resources:
GEOS-Chem shared data directories.
    The GEOS-Chem shared data directories contain many large files, such as:
    Meteorological data (a.k.a. the "met fields") used to drive GEOS-Chem
    Emissions inventories for the HEMCO emissions component
    Scale factors used to scale emissions from a base year to a given year
    Oxidant (OH, O3) concentrations for both full-chemistry and specialty simulations
    Sample restart files for the various types of GEOS-Chem simulations
    These data directories are too large to fit into your own personal disk quota. We
    recommend that you (or your IT staff) download the shared data directories to a common
    disk space where all GEOS-Chem users in your group can access them.
    You will only need to download the shared data directories from scratch once. As new
    met fields or emissions data are added, you can download the new folders and files
    individually.
    For more information, please see the following resources:
GEOS-Chem source code directory.
    This is the directory where the Fortran-90 source code files (i.e. *.F, *.F90 files)
    and Makefiles reside. Your Fortran compiler will create an executable (geos) from these
    source code files.
GEOS-Chem run directories.
    You can create run directories for the various GEOS-Chem simulations with our GEOS-Chem
    Unit Tester. You can store the run directories in your personal disk space.
ftp ftp.as.harvard.edu
We recommend that you use the wget utility to download these directories instead of anonymous
FTP. Wget allows you to download multiple directories at once, with a command such as this
one:
1. You will need to download the "CN" (constant) data files for each horizontal grid
that you are using.
For GEOS-FP these are timestamped for 2011/01/01 and are found in these
data directories of ftp.as.harvard.edu:
gcgrid/data/GEOS_4x5/GEOS_FP/2011/01/GEOSFP.20110101.4x5.nc
gcgrid/data/GEOS_2x2.5/GEOS_FP/2011/01/GEOSFP.20110101.2x25.nc
gcgrid/data/GEOS_0.25x0.3125_CH/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.CH.nc
gcgrid/data/GEOS_0.25x0.3125_NA/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.NA.nc
For MERRA-2 these are timestamped for 2015/01/01 and are found in these
data directories of ftp.as.harvard.edu:
gcgrid/data/GEOS_4x5/MERRA2/2015/01/20150101.cn.4x5
gcgrid/data/GEOS_2x2.5/MERRA2/2015/01/20150101.cn.2x25
gcgrid/data/GEOS_0.5x0.625_AS/MERRA2/2015/01/MERRA2.20150101.0.5x0.625.AS.nc
gcgrid/data/GEOS_0.5x0.625_NA/MERRA2/2015/01/MERRA2.20150101.0.5x0.625.NA.nc
Also note: NASA/GMAO recommends discontinuing use of the GEOS-5 and MERRA met
field data products, as these have now been superseded by GEOS-FP and MERRA-2.
The path gcgrid/data/ExtData/ is the root path under which the GEOS–Chem shared data
directories reside. This is where you will find the GEOS-Chem met fields and emissions data.
Starting with GEOS-Chem v10-01, the data directory naming structure has changed since we no
longer need to store emissions data on multiple grids. All of the GEOS-Chem data directories are
now subdirectories of the ExtData directory. For more information, please see our Setting up the
ExtData directory wiki page.
Directory Description
Restart files
Model output (bpch and netCDF formats)
Log files
Input files
Evaluation plots
--Bob Yantosca (talk) 21:46, 19 December 2016 (UTC)
Dalhousie data directory archive
The GEOS-Chem data and meteorological fields used by Dalhousie University are also available
via anonymous FTP from:
ftp rain.ucis.dal.ca
We recommend that you use the wget utility to download these directories instead of anonymous
FTP. Wget allows you to download multiple directories at once. The Wget command will take the
form:
1. You will need to download the "CN" (constant) data files for each horizontal grid
that you are using.
For GEOS-FP these are timestamped for 2011/01/01 and are found in these
data directories of rain.ucis.dal.ca:
ctm/GEOS_4x5.d/GEOS_FP/2011/01/GEOSFP.20110101.4x5.nc
ctm/GEOS_2x2.5.d/GEOS_FP/2011/01/GEOSFP.20110101.2x25.nc
ctm/GEOS_0.25x0.3125_CH.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.CH.nc
ctm/GEOS_0.25x0.3125_EU.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.EU.nc
ctm/GEOS_0.25x0.3125_NA.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.NA.nc
For MERRA-2 these are timestamped for 2015/01/01 and are found in these
data directories of rain.ucis.dal.ca:
ctm/GEOS_4x5.d/MERRA2/2015/01/20150101.cn.4x5
ctm/GEOS_2x2.5.d/MERRA2/2015/01/20150101.cn.2x25
ctm/GEOS_0.5x0.625_AS.d/MERRA2/2015/01/MERRA2.20150101.0.5x0.625.AS.nc
ctm/GEOS_0.5x0.625_EU.d/MERRA2/2015/01/MERRA2.20150101.0.5x0.625.EU.nc
ctm/GEOS_0.5x0.625_NA.d/MERRA2/2015/01/MERRA2.20150101.0.5x0.625.NA.nc
Global MERRA-2 data at 0.5° x 0.625° resolution and GEOS-FP data at 0.25° x 0.3125°
resolution have been moved to the Compute Canada data directory archive.
Directory Description
http://geoschemdata.computecanada.ca
We recommend that you use the wget utility to download these directories. Wget allows you to
download multiple directories at once. The Wget command will take the form:
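1. You will need to download the "CN" (constant) data files for each horizontal grid
that you are using.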
For GEOS-FP these are timestamped for 2011/01/01 and are found in these
data directories of rain.ucis.dal.ca:
GEOS_4x5.d/GEOS_FP/2011/01/GEOSFP.20110101.4x5.nc
GEOS_2x2.5.d/GEOS_FP/2011/01/GEOSFP.20110101.2x25.nc
GEOS_0.25x0.3125_CH.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.CH.nc
GEOS_0.25x0.3125_EU.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.EU.nc
GEOS_0.25x0.3125_NA.d/GEOS_FP/2011/01/GEOSFP.20110101.0.25x0.3125.NA.nc
GEOS_0.25x0.3125/GEOS_FP/2011/01/GEOSFP.20110101.CN.025x03125.nc
For MERRA-2 these are timestamped for 2015/01/01 and are found in these
data directories of rain.ucis.dal.ca:
GEOS_4x5.d/MERRA2/2015/01/20150101.cn.4x5
GEOS_2x2.5.d/MERRA2/2015/01/20150101.cn.2x25
GEOS_0.5x0.625_AS.d/MERRA2/2015/01/MERRA2.20150101.0.5x0.625.AS.nc
GEOS_0.5x0.625_EU.d/MERRA2/2015/01/MERRA2.20150101.0.5x0.625.EU.nc
GEOS_0.5x0.625_NA.d/MERRA2/2015/01/MERRA2.20150101.0.5x0.625.NA.nc
GEOS_0.5x0.625/MERRA2/2015/01/MERRA2.20150101.CN.05x0625.nc
Directory Description
If you wish to trim the name of the downloaded directory (i.e., so it downloads
as DIRECTORY_NAME, not pub/geos-chem/data/DIRECTORY_NAME), then use the
--cut-dirs option:
Due to the huge volume of data involved, this is not recommended, as the file downloads may
swamp your system. It is better to download the data in smaller chunks. For example:
2. Download all GEOS-5 met data at 2° x 2.5° resolution:
Maybe this is common knowledge, but I just discovered that using the -N option in wget
ensures that only files with newer timestamps than what resides on my local machines
are downloaded - found this very useful to update my shared data directories.
--Melissa Payer 10:39, 1 June 2012 (EDT)
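The wget options discussed above (--cut-dirs to trim leading directory names and -N to fetch only newer files) can be combined in one command. The sketch below assembles such a command as an argument list; the helper name build_wget_command and the example URL are illustrative assumptions, and the flags used (-r, -nH, -N, --cut-dirs) are standard wget options. The list is only built here, not executed.

```python
from urllib.parse import urlparse


def build_wget_command(url, keep_last=1):
    """Return a wget argument list for a recursive, timestamp-aware download.

    keep_last is the number of trailing path components to keep in the
    local directory name; all leading components are cut.
    build_wget_command is a hypothetical helper, not part of GEOS-Chem.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    cut_dirs = max(len(parts) - keep_last, 0)
    return [
        "wget",
        "-r",                      # recurse into subdirectories
        "-nH",                     # do not create a host-name directory
        "-N",                      # skip files that are not newer than local copies
        f"--cut-dirs={cut_dirs}",  # drop leading remote directory names
        url,
    ]


# Illustrative URL only; substitute the directory you actually need.
example_command = build_wget_command(
    "ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/HEMCO/"
)
```

Passing the resulting list to subprocess.run would start the download; printing it first is a cheap way to check the options before transferring large volumes of data.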
ftp ftp.as.harvard.edu
cd gcgrid/geos-chem/data/ExtData/GCAP_4x5
cd gcgrid/geos-chem/data/ExtData/GEOS_0.25x0.3125_CH
cd gcgrid/geos-chem/data/ExtData/GEOS_0.25x0.3125_NA
cd gcgrid/geos-chem/data/ExtData/GEOS_0.5x0.666_CH
cd gcgrid/geos-chem/data/ExtData/GEOS_0.5x0.666_NA
cd gcgrid/geos-chem/data/ExtData/GEOS_2x2.5
cd gcgrid/geos-chem/data/ExtData/GEOS_4x5
cd gcgrid/geos-chem/data/ExtData/GEOS_MEAN
cd gcgrid/geos-chem/data/ExtData/GEOS_NATIVE
cd gcgrid/geos-chem/data/ExtData/HEMCO
cd gcgrid/geos-chem/data/ExtData/CHEM_INPUTS
data/GEOS_4x5/GEOS_5/YYYY/MM
However, on the ftp site, we find a directory structure with an extra '.d':
data/GEOS_4x5.d/GEOS_5/YYYY/MM
(the GEOS_5 folder is in GEOS_4x5.d rather than GEOS_4x5). There does exist
a GEOS_4x5 that contains many of the emissions data, but does not contain GEOS_5.
If we leave the structure as is, and enter ../data/GEOS_4x5/ as our root data directory
in input.geos, we get a file not found error when it looks for GEOS_5 within this directory
(obviously).
If we instead enter ../data/GEOS_4x5.d as our root data directory, we get a file not
found error when the program looks for emissions within this directory (lightning NOx
emissions, in this case).
QUESTION: To solve this problem, we have moved the GEOS_5 folder into
the GEOS_4x5 directory. [Is this] okay?
Bob Yantosca replied:
The only difference on our system between e.g. GEOS_4x5 and GEOS_4x5.d is that our
sysadmin (Jack Yatteau) set up the ".d" directories separately so that they only contain
met data (which is much larger than the emissions etc. data). That way he could separate
the disks that just had met data from the disk that have the emissions data to facilitate
our configuration here. There are symbolic links from GEOS_4x5 to GEOS_4x5.d etc. (i.e.
the directory GEOS_4x5/GEOS_5 is actually a symbolic link to the corresponding directory
in GEOS_4x5.d/GEOS_5/ and etc. for the other met field resolutions & directories).
You don't necessarily have to do this on your end, but this is what we did here. You can
just make the GEOS_4x5/GEOS_5 etc. real subdirectories and not symbolic links and
store the data there. The solution you picked above is OK.
Also to facilitate FTP file transfer, you could do the following:
I wanted to alert you about something you might know already. I have been trying to run
Tomas Sherwen's halogen chem code here and was seeing some substantial differences
in my output files compared to his, and I finally traced them to differences in *cld* met
files between the Harvard (where I got my met files) and Dalhousie (where he got his met
files) archives.
I have only looked at the GEOS_FP 4x5 2013 July *cld* files - I don't know if this is a
more widespread issue affecting other met files/met products/years/resolutions. But
thought I should alert you.
Chi Li wrote:
I have retrieved the original GMAO GEOS-FP data for July, 2013, and generated the 4x5
A3cld data. I double checked the code and recompiled it, to guarantee that the new way
of calculating CLOUD is included in the new processing.
I read in “CLOUD”, “OPTDEPTH” variables and compared the values. The differences
were 0 everywhere and every time step when compared with the Dalhousie data for
every day. Meanwhile when compared with the Harvard data, the maximum differences
in OPTDEPTH could reach ~15 for July 5-31.
I did not compress these data to nc4 but they readily agree with the Dalhousie data. So
from my view I would say the Dalhousie archive should represent the more recent
updates, at least for this month.
I am not sure how to look more into the difference between Harvard and Dal data. It is
strange that the Harvard data agree exactly with Dal and newly processed data on every
level and every time step for July 1-4. Anyway, if you recall something else to check with,
I am willing to help.
Bob Yantosca wrote:
Odd – it seems like on our end, July 1-4 files were created after the commit that was
made to use CLOUD (on 2013-12-12) but July 5-31 were made before (on 2013-11-23 or
thereabouts). I am not sure what happened. Now I’m thinking that the Harvard files for
this month might not be correct.
One thing – since we use July 2013 as our benchmarking year, this will affect the GC
benchmarks, if we change the met fields.
--Melissa Sulprizio (talk) 20:13, 15 August 2017 (UTC)
Wrong No. of vertical layers in A3mstE files for 0.5x0.625 nested regions
NOTE: This inconsistency is fixed since 201710, but not for historical
data archive since we are uncertain whether it actually would affect the
simulation. We welcome future inquiry and discussion from users of the
GEOS_FP 0.5x0.625 data.
Chi Li wrote:
I have another bug in the code to report. After examining the met field
processing code, I found a bug that affects the 0.5 degree GEOS-FP nested data
(NA, EU, AS, SE, CH regions). That is, the code mistakenly outputs 72 layers of data in
the “A3mstE” files instead of 73. This is due to:
    IF ( doNestCh05 ) THEN
       fName = TRIM( tempDirTmplNestCh05 ) // TRIM( dataTmplNestCh05 )
       gName = 'nested CH 05'
       CALL ExpandDate  ( fName, yyyymmdd, 000000 )
       CALL StrRepl     ( fName, '%%%%%%', 'A3mstE' )
       CALL NcOutFileDef( I_NestCh05, J_NestCh05, L05x0625, TIMES_A3, &
                          xMid_05x0625(I0_ch05:I1_ch05),              &
                          yMid_05x0625(J0_ch05:J1_ch05),              &
                          zEdge_05x0625, a3Mins,                      &
                          gName, fName, fOut05NestCh )
    ENDIF
The “L05x0625” (72 layers) should be ‘L05x0625+1’ (73 layers). Same problems exist for
the other 4 nested regions. The nested regions at native resolution (0.25 degree) are not
affected.
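The off-by-one reported above follows from the relationship between layers and layer edges: a grid with N layers has N + 1 edges. The short Python sketch below illustrates this count with placeholder edge indices (not real pressure levels); it is not taken from the met field processing code.

```python
def midpoints(edges):
    """Return layer-center values computed from a list of level-edge values."""
    return [(lower + upper) / 2.0 for lower, upper in zip(edges[:-1], edges[1:])]


N_LAYERS = 72                       # layer count quoted in the report above
edges = list(range(N_LAYERS + 1))   # placeholder edge indices, one more than layers
centers = midpoints(edges)          # one value per layer
```

A file dimensioned with N_LAYERS for an edge-defined field therefore drops the topmost edge.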
Bob Yantosca wrote:
As far as I know, nobody is using the GEOS-FP half-degree nested grids (or if there are,
we haven’t heard from them).
Also, for the A3mstE data, all of the data fields (e.g. CMFMC) are zero at the top of the
atmosphere anyway. Those are all cloud fields, which go to zero at about 20km above
the surface.
HEMCO data directories
On this page we describe the directory tree from which the HEMCO emissions component can
read emissions inventories and other atmospheric data sets.
Contents
1 Overview
2 Default GEOS-Chem emissions configurations
3 HEMCO data directory structure
o 3.1 Aerosol emissions
o 3.2 Anthropogenic and biofuel emissions
o 3.3 Anthropogenic aircraft and ship emissions
o 3.4 Biomass burning emissions
o 3.5 Emissions implemented as HEMCO extensions
o 3.6 Future and historical emissions
o 3.7 GEOS-Chem specialty simulation data
o 3.8 Halogen emissions
o 3.9 Natural emissions data
o 3.10 Non-emissions data
o 3.11 Oceanic emissions
o 3.12 Other inputs for HEMCO
4 Downloading the HEMCO data directories
o 4.1 Obtaining the hemco_data_download package
o 4.2 Setting up the configuration file
o 4.3 Downloading the data
o 4.4 New features for the GEOS-Chem v10-01 public release
5 Submitting new data for use with HEMCO
Overview
The HEMCO emissions component can read several types of emission inventories, as well as
other types of atmospheric data sets, such as production and loss rates, or concentration data. We
have collated all of this data into a comprehensive directory tree structure. Each folder of the
HEMCO data directory tree represents a particular emissions inventory or other data set.
At present, the HEMCO data directory tree resides on the disk servers at Harvard University (and
soon at Dalhousie University). We have created a package that will let you download this
directory tree to your local disk storage space. For more information, please see
the Downloading the HEMCO data directories section below.
Please see our list of recommended default emission inventories, which has been compiled by
the Emissions and Deposition Working Group.
--Bob Yantosca (talk) 22:27, 7 September 2017 (UTC)
The Inventory column contains a link that describes each data set in more detail. Most of
these links point to existing pages on the GEOS-Chem wiki.
The Data file info column points to the README files that are stored with each data set.
Each README provides a list of the files contained within a given folder, information
about the data contained in each file, and (often) a description of how the files were
created.
The Path column shows the location of each data set, with respect to $ROOT, the top-
level HEMCO directory. For example, on the Harvard disk server, $ROOT points to the
directory /mnt/gcgrid/data/ExtData/HEMCO/.
The Status column describes the current status of each data set:
CURRENTLY USED
    Denotes that the data set is part of the standard emissions
    configuration for GEOS-Chem, that is:
    But you can still use the data set in your research
    applications if you wish.
TO BE ADDED SOON
    Denotes that the data set is not yet ready for use with
    HEMCO, but will be added soon.
OBSOLETE
    Denotes that the data set has been removed from the
    standard GEOS-Chem emissions configuration.
Inventory | Data file info | Path | Directory size | Status

Active inventories (turned ON by default in the standard emissions configuration)
AEROCOM volcanic emissions | README | $ROOT/VOLCANO/v2015-02 | 1.6 GB | CURRENTLY USED. Default global inventory for volcanic SO2.
Tami Bond et al (2007) BC and OC emissions | README | $ROOT/BCOC_BOND/v2014-07 | 3.0 MB | CURRENTLY USED. Default global inventory for black carbon (BC) and organic carbon (OC).
Secondary organic aerosols | README | $ROOT/SOA/2014-07 | 3.0 MB | CURRENTLY USED. Contain various inputs for SOA simulations.

Optional inventories (turned OFF by default in the standard emissions configuration)
Spatially varying OM/OC ratios for SOA species | README | $ROOT/OMOC/v2018-01 | 8.6 MB | OPTIONAL. Controlled by a switch in input.geos (default is OFF)

Future inventories (to be added in an upcoming version)
OMI-based volcanic emissions | README | $ROOT/VOLCANO/v2018-03 | 12 MB | TO BE ADDED SOON. Will be introduced in GC 12.0.3. Coverage extends from 2005-2012. Will replace the AEROCOM volcanic emissions.

Obsolete inventories (superseded by newer developments)
Cooke et al BC and OC emissions | README | $ROOT/BCOC_COOKE/v2014-07 | 744 KB | NOT USED. Superseded by Bond et al (2007).
--Bob Yantosca (talk) 18:49, 2 July 2018 (UTC)
Anthropogenic and biofuel emissions
The following subdirectories of the HEMCO directory tree contain
inventories of anthropogenic and biofuel emissions.
Please also see list of species included in each of these global
inventories and regional inventories.
Inventory | Data file info | Path | Directory size | Status

Active inventories (turned ON by default in the standard emissions configuration)
AEIC aircraft | README | $ROOT/AEIC/v2015-01 | 2.0 GB | CURRENTLY USED. Default global aircraft emissions inventory. Contains fuel burned, NO, CO, and hydrocarbons.
EMEP ship (CO, NO, SO2) | README | $ROOT/EMEP/v2015-03 | 24 MB | CURRENTLY USED. Overwrites ship CO, NO, SO2 over Europe. Stored together w/ other EMEP data files.

Optional inventories (turned OFF by default in the standard emissions configuration)
ARCTAS ship emissions (SO2) | README | $ROOT/ARCTAS_SHIP/v2014-07 | 508 KB | OPTIONAL. Superseded by CEDS ship emissions in GEOS-Chem 12.1.0 and later versions. See notes below.
ICOADS ship (CO) | README | $ROOT/ICOADS_SHIP/v2014-07 | 4.2 MB | OPTIONAL. Superseded by CEDS ship emissions in GEOS-Chem 12.1.0 and later versions. See notes below.
Corbett et al ship emissions (SO2) | README | $ROOT/CORBETT_SHIP/v2014-07 | 1.8 MB | OPTIONAL. Superseded by CEDS ship emissions in GEOS-Chem 12.1.0 and later versions. See notes below.
HTAP ship (CO, NO, SO2) | README | $ROOT/HTAP/v2015-03 | 4.5 GB | OPTIONAL. Stored together with other HTAP data files. See this document for more information.
NEI2011 ship (several species) | README | $ROOT/NEI2011/v2015-03 | 248 GB | OPTIONAL. NOTE: Only extends a few km from land, so this is more of a coastal emissions inventory. It is turned off by default.

Obsolete inventories (superseded by newer developments)
EDGAR v3 ship (CO) | README | $ROOT/EDGAR/v2014-07 | 40 MB | OBSOLETE
EDGAR v4.2 ship | README | $ROOT/EDGARv42/v2015-02 | 4.5 GB | NOT USED. EDGAR v4.2 ship emissions are not used because they are lumped with other non-road transportation sectors (aircraft, rail, etc), and thus cannot be easily separated.
Inventory | Data file info | Path | Directory size | Status

Active inventories (turned ON by default in the standard emissions configuration)
Acetone ocean exchange | README | $ROOT/ACET/v2014-07 | 52 KB | CURRENTLY USED
DEAD dust model | README | $ROOT/DUST_DEAD/2014-07 | 712 KB | CURRENTLY USED. Default dust mobilization scheme
Anthropogenic PM2.5 dust source (AFCID) | README | $ROOT/DUST_DEAD/2018-04 | 4.4 MB | CURRENTLY USED. Introduced in GEOS-Chem 12.1.0. AFCID will be used if the DEAD dust emissions extension is activated.
DMS ocean exchange | README | $ROOT/DMS/v2015-07 | 3.0 MB | CURRENTLY USED. This is the default in v11-01 and higher versions. Uses Lana 2011 climatology
MEGAN biogenic emissions | README | $ROOT/MEGAN/v2017-07 | 17 MB | CURRENTLY USED. Now uses high-resolution (0.25°) data files
(as above) | README | $ROOT/MEGAN/v2018-05 | 12 MB | CURRENTLY USED. Introduced in v11-02-rc. Fixes a regridding issue in the high-resolution MEGAN AEF data files.
NO from lightning | README | $ROOT/LIGHTNOX/v2017-09 | 8.5 MB | CURRENTLY USED. Introduced in GEOS-Chem v11-02. GEOS-FP OTD-LIS factors for 2012/04 - 2017/07. MERRA-2 OTD-LIS factors for any date
NO from lightning | README | $ROOT/LIGHTNOX/v2014-07 | 13 MB | CURRENTLY USED. CDF table from Ott et al. [JGR, 2010]. OTD-LIS factors superseded by LIGHTNOX/v2017-09
Inventory | Data file info | Path | Directory size | Status

Active inventories (turned ON by default in the standard emissions configuration)
Aerosol-only simulation | README | $ROOT/OFFLINE_AEROSOL/v2014-09 | 165 MB | CURRENTLY USED
CH4 simulation | README | $ROOT/CH4/v2014-09 | 274 MB | CURRENTLY USED
CO2 simulation | README1 to README6 | $ROOT/CO2/v2015-04; $ROOT/CO2/v2015-04/BIO/; $ROOT/CO2/v2015-04/BIOFUEL/; $ROOT/CO2/v2015-04/CHEM/; $ROOT/CO2/v2015-04/FOSSIL/; $ROOT/CO2/v2015-04/OCEAN/ | 1033 MB | CURRENTLY USED
Mercury simulation | README1 to README10 | $ROOT/MERCURY/v2014-09; $ROOT/MERCURY/v2014-09/ARTISANAL; $ROOT/MERCURY/v2014-09/BrOx; $ROOT/MERCURY/v2014-09/Hg2_PARTITION; $ROOT/MERCURY/v2014-09/JVALUES; $ROOT/MERCURY/v2014-09/NATURAL; $ROOT/MERCURY/v2014-09/NEI2005; $ROOT/MERCURY/v2014-09/OCEAN; $ROOT/MERCURY/v2014-09/SOIL; $ROOT/MERCURY/v2014-09/STREETS | 342 MB | CURRENTLY USED
POPs simulation | README | $ROOT/POPs/v2015-08 | 809 MB | CURRENTLY USED. Data corresponds to the POPs simulation update in v11-01c
RRTMG radiative transfer model | README | $ROOT/RRTMG/v2018-11 | 19 MB | TO BE ADDED SOON. Used in GEOS-Chem 12.1.0 and later versions
Tagged CO simulation | README | $ROOT/TAGGED_CO/v2017-04 | 260 KB | CURRENTLY USED. Contains files for the updated Tagged CO simulation in GEOS-Chem v11-02
Tagged O3 simulation | README | $ROOT/TAGGED_O3/v2014-09 | 372 MB | CURRENTLY USED
O3 for offline simulations | README | $ROOT/O3/v2014-09 | 130 MB | CURRENTLY USED
OH for offline simulations | README1, README2 | $ROOT/OH/v2014-09; $ROOT/OH/v2014-09/v5-07-08; $ROOT/OH/v2014-09/v7-02-03.GMI | 148 MB | CURRENTLY USED
H2O2 for offline simulations | README | $ROOT/OXIDANTS/v2014-07 | 32 MB | CURRENTLY USED
Oceanic Chlorophyll-A for Hg simulations | README | $ROOT/CHLA/v2014-07 | 2.0 MB | CURRENTLY USED

Obsolete inventories (superseded by newer developments)
CH3I simulation | README | $ROOT/CH3I/v2014-07 | 280 KB | OBSOLETE. This simulation is no longer used in GEOS-Chem (awaiting revival)
POPs simulation | README | $ROOT/POPs/v2014-09 | 622 MB | OBSOLETE. Superseded by v2015-08
Inventory | Data file info | Path | Directory size | Status

Active inventories (turned ON by default in the standard emissions configuration)
GMI strat chem mechanism | README | $ROOT/GMI/v2015-02 | 16 GB | CURRENTLY USED. Used to compute P & L of species in the stratosphere.
Stratospheric Bry from CCM | README | $ROOT/STRAT/v2015-01/Bry | 385 MB | CURRENTLY USED. Required for the full-chemistry simulations.
Timezone offsets from UTC | README | $ROOT/TIMEZONES/v2015-02 | 264 KB | CURRENTLY USED. Used by HEMCO to compute emissions that depend on local time rather than on UTC time.
TOMS/SBUV O3 columns | README | $ROOT/TOMS_SBUV/v2016-11 | 21 MB | CURRENTLY USED. This is default in v11-01 and higher versions. These files are the same data as in $ROOT/TOMS_SBUV/v2015-03, but were reprocessed by Barron Henderson in order to fix a strange cycle in OH output when running GEOS-Chem with GEOS-5 met. Ignored if you are using GEOS-FP or MERRA-2 meteorology.
UCX chemistry mechanism | README | $ROOT/UCX/v2018-02 | 12 GB | CURRENTLY USED. Used to compute P & L of species in the stratosphere. Can be used instead of GMI.
UV surface albedos | README | $ROOT/UVALBEDO/v2015-03 | 476 KB | CURRENTLY USED. Required inputs for the FAST-JX v7.0 photolysis mechanism

Obsolete inventories (superseded by newer developments)
TOMS/SBUV O3 columns | README | $ROOT/TOMS_SBUV/v2015-03 | 17 MB | OBSOLETE. Superseded by v2016-11
--Bob Yantosca (talk) 19:04, 2 July 2018 (UTC)
Oceanic emissions
The following folders contain data used to compute oceanic emissions
of GEOS-Chem species.
Inventory | Data file info | Path | Directory size | Status
Fields to compute ALD2 emissions: seawater concentration of acetaldehyde; heterotrophic respiration rates, used to compute biogenic emissions of ALD2 and EOH | README | $ROOT/ALD2/v2017-03 | 0.5 MB | CURRENTLY USED. Introduced with the PAN updates in v11-02a.
--Bob Yantosca (talk) 17:14, 28 June 2018 (UTC)
Other inputs for HEMCO
The following subdirectories of the HEMCO directory tree contain input data for
various HEMCO functions. These include regional masks, emission
scale factors, and grid information.
git clone https://github.com/GCST/hemco_data_download.git
File / Description

README
    File with an overall description of the directory contents

hemcoDataDownload.pl
    Perl script to download HEMCO data directories

hemcoDataDownload.rc
    Configuration file for the hemcoDataDownload.pl script. In this file
    you can specify which HEMCO data directories you would like to
    download and which you would like to omit.

forTesting.rc
    A configuration file that you can use for testing or debugging. This
    will tell hemcoDataDownload.pl only to download a couple of emissions
    inventories with files that do not take up much disk space.
--Bob Yantosca (talk) 17:02, 5 December 2016 (UTC)
Setting up the configuration file
The configuration files
(i.e. hemcoDataDownload.rc and forTesting.rc) are pretty much
self-explanatory.
At the top of the configuration file you will see this section:
###############################################################################
#                                                                             #
# Specify the remote and local HEMCO data paths, plus other options.          #
#                                                                             #
###############################################################################
Path / Description

Remote HEMCO data path
    Location on the FTP server from which you are going to download the
    data. This can be from either Harvard or from Dalhousie. (For now we
    will use the Harvard server). You can edit this accordingly.

Your HEMCO data path
    The root-level directory for HEMCO data on your own disk space. If you
    are not sure where to place this, then ask your sysadmin.

Verbose output
    Lets you specify if you want to print out extra output during the
    download process. This can be set to either "yes" or "no".

Dryrun only
    Allows you to print out the data download commands without actually
    downloading the data. This is useful for debugging. This can be set to
    either "yes" or "no".
In the next section you specify all of the HEMCO inventories that you
want to download. You will see this header:
###############################################################################
#                                                                             #
# THE FOLLOWING DATA DIRECTORIES WILL BE DOWNLOADED.                          #
#                                                                             #
# These data directories comprise the recommended emissions configuration     #
# for typical GEOS-Chem full-chemistry and specialty simulations.             #
#                                                                             #
# NOTE: In most cases, you only have to specify the top-level folder.         #
# All subfolders will be downloaded automatically.                            #
#                                                                             #
###############################################################################
#=============================+================================================
# AEROSOLS                    | Directory paths
#=============================+================================================
AEROCOM volcano emissions     | $ROOT/VOLCANO/v2014-10
Bond et al BC/OC              | $ROOT/BCOC_BOND/v2014-07
Cooke et al BC/OC             | $ROOT/BCOC_COOKE/v2014-07
Secondary organic aerosols    | $ROOT/SOA/v2014-07
... etc ...
Each line specifies the name of a HEMCO emissions inventory and the
data path where it can be found on disk, relative to the root data path.
NOTE: The script will replace the $ROOT token with the value you gave
to the "HEMCO remote data path" above. (Lines starting with the
comment character # will be ignored.)
Any inventory found in this section will be downloaded. To prevent an
inventory from being downloaded you can either comment it out (i.e.
place a # in the first column) or move the inventory to the next section.
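The entry format described above (a name, a | separator, and a path containing the $ROOT token, with # marking comments) is simple to process. The Python sketch below parses such lines for illustration; it is not the logic of hemcoDataDownload.pl, which is a Perl script, and the function name and the root value are assumptions made for this example.

```python
def parse_inventories(text, root):
    """Map inventory names to paths, expanding the $ROOT token.

    Blank lines, lines starting with '#', and lines without a '|'
    separator are skipped. parse_inventories is a hypothetical helper.
    """
    entries = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or "|" not in stripped:
            continue
        name, path = (field.strip() for field in stripped.split("|", 1))
        entries[name] = path.replace("$ROOT", root.rstrip("/"))
    return entries


SAMPLE = """
# AEROSOLS                    | Directory paths
AEROCOM volcano emissions     | $ROOT/VOLCANO/v2014-10
Bond et al BC/OC              | $ROOT/BCOC_BOND/v2014-07
#Cooke et al BC/OC            | $ROOT/BCOC_COOKE/v2014-07
"""

# The root below is a placeholder, not a real server path.
inventories = parse_inventories(SAMPLE, "ftp://example.org/HEMCO/")
```

In this sample the Cooke et al entry is commented out, so it is absent from the resulting dictionary.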
The final section specifies HEMCO emission inventories that you do not
wish to download. The section looks like this:
###############################################################################
#                                                                             #
# THE FOLLOWING DATA DIRECTORIES WILL NOT BE DOWNLOADED.                      #
#                                                                             #
# These data directories contain are optional emissions inventories that      #
# are not used in typical GEOS-Chem simulations. If you wish to download      #
# any of these inventories, simply move the corresponding entry for each      #
# inventory to the previous section.                                          #
#                                                                             #
###############################################################################
hemcoDataDownload.pl
hemcoDataDownload.pl myNewConfigFile.rc
Before you start downloading gigabytes of data, we recommend that you run
a short test to make sure that the data directories are being copied to
the proper locations on your disk server. For this purpose, we have
provided a configuration file named forTesting.rc. Typing
hemcoDataDownload.pl forTesting.rc
will only download a couple of data inventories that do not take up much
disk space. This allows you to ensure that the data transfer is successful
without making you wait a long time.
--Bob Y. 13:57, 12 February 2015 (EST)
New features for the GEOS-Chem v10-01 public release
For the GEOS-Chem v10-01 public release, we modified the default
download options in the hemcoDataDownload.pl script. We changed
the default wget options from:
to
This will tell wget to only download new or modified files, instead of
trying to download the entire data archive from scratch. This should
make subsequent data downloads faster.
--Bob Y. (talk) 18:35, 16 June 2015 (UTC)
Restart files
You will need a restart file before you can start your GEOS-Chem simulation.
A restart file contains the initial conditions for a GEOS-Chem simulation.
There are two restart files for GEOS-Chem:
1. GEOS-Chem restart file containing instantaneous species
concentrations (Required)
2. HEMCO restart file containing values needed for some of the HEMCO
extensions (Optional)
When you run a GEOS-Chem simulation, it will write new GEOS-Chem restart
files at the intervals you specify in input.geos. If HEMCO is used in your
simulation, new HEMCO restart files are written at the frequency configured
in HEMCO_Config.rc.
GEOS-Chem v11-01 run directories are configured to use initial GEOS-Chem
restart files in netCDF format. These files are available for download at:
ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/SPC_RESTARTS/
CAVEAT: The initial restart files do not reflect the actual atmospheric
state and should only be used to "spin up" the model. In other words,
they should be used as initial values in an initialization simulation to
generate more accurate initial conditions for your production runs.
Doing a one year spin up is usually sufficient; however, we recommend ten
years for ozone, carbon dioxide, and methane simulations, and four years for
radon-lead-beryllium simulations. If you are in doubt about how long your spin
up should be for your simulation, we recommend contacting the GEOS-Chem
Working Group that specializes in your area of research.
You may spin up the model starting at any year for which there is met data,
but you should always start your simulations at the month and day
corresponding to the restart file to more accurately capture seasonal variation.
If you want to start your production run at a specific date, we recommend
doing a spin up for the appropriate number of years plus the number of days
needed to reach your ultimate start date. For example, if you want to do a
production simulation starting on 12/1/13, you could spin up the model for one
year using the initial GEOS-FP restart file dated 7/1/13 and then use the new
restart file to spin up the model for five additional months, from 7/1/13 to
12/1/13.
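The arithmetic in the example above (a restart file dated 7/1/13 and a production start of 12/1/13, hence five additional months) can be checked with a few lines of Python. This is only a date-counting sketch; the function name is hypothetical and nothing here configures or runs GEOS-Chem.

```python
from datetime import date


def months_between(start, end):
    """Whole months from start to end; both must share the same day of month."""
    if start.day != end.day:
        raise ValueError("start and end must fall on the same day of the month")
    return (end.year - start.year) * 12 + (end.month - start.month)


restart_date = date(2013, 7, 1)       # date of the initial restart file
production_start = date(2013, 12, 1)  # desired start of the production run
extra_months = months_between(restart_date, production_start)
```

The same function gives the extra spin-up length for any other start date that falls on the same day of the month as the restart file.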
To determine the date of a netCDF restart file, you may use ncdump. For
example:
The -t option will return the time value as human-readable date-time strings
rather than numerical values in units such as "hours since 1985-1-1 00:00:0.0".
The date of a binary punch restart file can be determined by opening the file in
GAMAP.
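If you only have the numerical time value and its units string, the conversion that ncdump -t performs can be approximated by hand. The Python sketch below is a simplified decoder written for illustration: it assumes the reference time in the units string is midnight and handles only seconds, minutes, hours, and days. The function name is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Seconds per supported time unit (simplifying assumption: no months or years).
_UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}


def decode_time(value, units):
    """Convert a numeric time value to a datetime using a '<unit> since <date>' string.

    The time-of-day part of the units string is ignored and taken as 00:00.
    """
    match = re.match(r"\s*(\w+)\s+since\s+(\d+)-(\d+)-(\d+)", units)
    if match is None:
        raise ValueError(f"unrecognized units string: {units!r}")
    unit, year, month, day = match.groups()
    if unit not in _UNIT_SECONDS:
        raise ValueError(f"unsupported time unit: {unit!r}")
    origin = datetime(int(year), int(month), int(day))
    return origin + timedelta(seconds=value * _UNIT_SECONDS[unit])


UNITS = "hours since 1985-1-1 00:00:0.0"
one_day_later = decode_time(24, UNITS)
```

For real restart files, ncdump -t or a netCDF library remains the more reliable route, since those handle calendars and reference times in full.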
Using a HEMCO restart file for your initial spin up run is optional. The HEMCO
restart file contains fields for initializing variables required for Soil NOx
emissions, MEGAN biogenic emissions, and the UCX chemistry mechanism.
The HEMCO restart file that comes with a run directory may only be used for
the date and time indicated in the filename. HEMCO will automatically
recognize when a restart file is not available for the date and time required,
and in that case HEMCO will use default values to initialize those fields. You
can also force HEMCO to use the default initialization values by setting
"HEMCO_RESTART" to false in HEMCO_Config.rc. For more information, see
the HEMCO User's Guide.
You can read more about restart files at the GEOS-Chem output files wiki
page.
--Melissa Sulprizio (talk) 16:03, 12 January 2017 (UTC)
Visualization packages
In this section we provide information about software packages that you can
use to analyze and plot GEOS-Chem output.
GAMAP and other IDL software
NOTE: IDL, which is proprietary software, can be very expensive. For
this reason, the GEOS-Chem Support Team and other GEOS-Chem
developers are currently developing several open-source software
packages (mostly based on Python) for GEOS-Chem data analysis and
visualization. Please see our Python software section below.
The traditional GEOS-Chem visualization software is GAMAP. This package
was customized to GEOS-Chem and is still heavily used today. GAMAP
requires the Interactive Data Language (a proprietary package). For more
information about GAMAP, please see:
https://github.com/geoschem/geos-chem
This will create an exact copy (or clone) of the official GEOS-Chem
repository to your local disk space in a directory
named Code.GC12. Using Code.GC12 as your local repository
name is optional and you may specify a different directory name if
you wish. When you clone the source code you will always get the
most recent state of the repository, meaning the latest GEOS-Chem
version or bug fix patch.
where LOCAL-DIR-NAME is the name of the local directory on your disk into which the GEOS-
Chem source code files will be placed. It is up to you to pick LOCAL-DIR-NAME.
For more information, please see Chapter 5: GEOS-Chem source code directory in the GEOS-
Chem Online User's Guide.
Run directories
There is a unique run directory for every combination of GEOS-Chem simulation type, grid
resolution, meteorological data source, and nested region. A collection of GEOS-Chem run
directories is stored in the GEOS-Chem Unit Tester, which is available for download via Git.
More information can be found on the following wiki pages:
(1) Change into your code directory and start gitk as follows:
(2) Or if you are using the git gui GUI browser (more on that below), you can
invoke gitk from the Repository/Visualize master's History menu item.
At the top left of the gitk screen, you will see the graph of revisions. Each dot represents
a commit, along with the log message that accompanied each commit.
Note that at the most recent commit (i.e. the line at the very top) there are 2 green
boxes, one labeled master and one labeled remotes/origin/master:
Making revisions
Using the GUI browser
We recommend using the git gui for source code management. Start this in your code
directory:
Bug_fix_sulfate_mod
CO2_simulation
KPP_with_isoprene
Methane_simulation
You will be automatically placed into the branch you have just created.
Committing
With Git, you should commit frequently, such as when you have completed making
revisions to a file or group of files. Commits that are made on one branch will not affect
the other branches.
Committing is best done with the git gui. Follow these steps:
1. Pick Commit/Rescan from the menu (or press the F5 key) to force the git gui to
   show the latest changes.
2. You should get a list of files in the Unstaged Changes window. Clicking on the
   icons to the left of the file names will send them to the Staged
   Changes window. Git will add all of the files in Staged Changes to the
   repository the next time you commit. Note: Clicking on the icon of a file in
   the Staged Changes window moves the file back to the Unstaged Changes window.
3. Type a commit message in the bottom right window. See this example of a
   good commit message. Some pointers are:
   1. The first line should be 50 characters or fewer and succinctly
      describe the commit.
   2. Then leave a blank line.
   3. Then add more in-depth text that describes the commit.
   4. Then click on the Signed-off by button. This will add your name, email
      address, and a timestamp. Note: To modify your name and email
      address, edit the .gitconfig file in your home directory.
4. There are two radio buttons above the commit message window.
   1. New commit: This is the default. It assumes that you are making a totally new
      commit.
   2. Amend last commit: Pick this button if for whatever reason you need to
      update the last commit message.
5. Click on the Commit button.
If you then start the gitk viewer, your new commit should be visible.
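The commit message pointers above (a first line of at most 50 characters, followed by a blank line) are easy to check automatically. Here is a minimal Python sketch of such a check. The function name check_commit_message and the sample messages are hypothetical and are not part of Git or GEOS-Chem.

```python
def check_commit_message(message: str) -> list:
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        problems.append("missing summary line")
        return problems
    if len(lines[0]) > 50:
        problems.append("summary line is longer than 50 characters")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    return problems


good = "Fix unit conversion in sulfate_mod\n\nLonger description of the change."
bad = ("Fix unit conversion in sulfate_mod and also tidy up several unrelated comments\n"
       "No blank line before this text")
print(check_commit_message(good))  # []
print(check_commit_message(bad))   # two problems are reported
```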
Renaming files
In some instances you may find it necessary to rename files. For example, in GEOS-Chem
v9-01-02, we had to rename files ending in .f to .F and .f90 to .F90. If only
the name of the file changes, then Git will recognize it as a renamed file in the repository.
To rename a file, follow these steps:
1. Change the name of the file with the Unix mv command. For example: mv
myfile.f myfile.F
2. Open the git gui. You will see the two files myfile.f and myfile.F listed in
the Unstaged Changes window.
3. Click on myfile.f and myfile.F; this will move them to the Staged
Changes window.
4. In Staged Changes you will see:
1. File myfile.f is slated to be removed (i.e. a red "X" is listed next to the
file name).
2. File myfile.F is slated to be added (i.e. a green checkmark is listed
next to the file name).
5. Add a commit message, sign off, and click Commit as described above.
6. Start the gitk browser. In the lower left window, you should see text such as:
From this point forward, file myfile.F will use the *.F file extension. However, it will still
possess the total revision history from when the file was still named myfile.f. If you
merge changes from another repository that still has myfile.f, then these changes will
be seamlessly integrated into myfile.F.
Switching between branches
Before you switch from one branch to another (aka "checking out a branch"), it is
recommended to commit any remaining unstaged files to the current open branch.
Unstaged files will remain in your working directory even after you check out a different
branch. This can potentially lead to confusion or may cause an error message from Git.
To check out a different branch in the git gui, go to Branch/Checkout on the menu and pick
the name of the branch you would like to switch to. The current branch name will be
displayed just below the menu at top left.
Once you have created your branch and have checked it out, then you may begin
making modifications to the source code with your favorite text editor.
We recommend keeping one open branch per new feature that you are adding into
GEOS-Chem. This lets you test each individual feature separately. After each feature
has been validated, you may merge each individual branch back into the master branch.
Merging
When you are ready to merge your changes back into the mainline master branch, then
you can follow this procedure.
<<<<<<< HEAD
! This is old source code that already exists in the branch
...
=======
! This is new source code that is being merged into the branch
...
>>>>>>> 77976da35a11db4580b80ae27e8d65caf5208086
At the top of the slug you see the string <<<<<<< HEAD followed by some source code.
This is the "old" code, i.e. the code that existed as of the last commit. A separator
line ======= then follows the source code.
Underneath the separator line, you will see the "new" source code, i.e. the code that we
are merging into the branch. This source code is followed by the text >>>>>>>
77976da35a11db4580b80ae27e8d65caf5208086. The long hexadecimal string is the
SHA1 ID (the internal ID used by Git) corresponding to the commit that we are merging
into the branch. Each commit has a unique SHA1 ID.
To resolve a file containing conflicts, do the following:
1. Open the file in your favorite text editor (vi, emacs, etc.)
2. Search for the word HEAD. This will take you to the location of each slug (where
conflicts exist).
3. Decide which code that you want to keep.
4. Delete the code that you do not want to keep.
5. Delete the lines <<<<<<< HEAD, =======, and >>>>>>> 7797... If you keep
these in the source code you will get compilation errors.
6. Repeat this process for each conflict that you find.
Once you have resolved the conflicts in each file, you can commit them back into the
repository.
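Before committing, it can be useful to confirm that no conflict markers were left behind. The following minimal Python sketch scans text for the three marker lines described above. The function name find_conflict_markers is hypothetical, and note that any ordinary line that happens to start with ======= would also be flagged.

```python
def find_conflict_markers(text: str) -> list:
    """Return (line_number, marker) pairs for every Git conflict marker line."""
    markers = ("<<<<<<<", "=======", ">>>>>>>")
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        for marker in markers:
            if line.startswith(marker):
                found.append((number, marker))
                break
    return found


sample = """<<<<<<< HEAD
! This is old source code that already exists in the branch
=======
! This is new source code that is being merged into the branch
>>>>>>> 77976da35a11db4580b80ae27e8d65caf5208086
"""
print(find_conflict_markers(sample))
```

An empty list means that no marker lines remain in the text that was checked.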
Tagging
Git allows you to tag a particular commit with an alphanumeric string for easy reference.
This tag will allow users to just refer to the tag name using git pull.
You can add a tag via the command line:
NOTE: Tagging is something that typically only the GEOS-Chem Support Team will do.
Deleting branches
Once you have merged your changes back into the master branch, you may delete the
branch you just created. In the git gui, go to the Branch/Delete menu item. You will be
given a dialog box where you can select the name of the branch you wish to delete.
Sharing your revisions with others (and vice versa)
One of the really nice features of Git is that it can create patch files, i.e. files that
contain a list of changes that can be imported into someone else's local Git repository.
Using patch files removes the need to merge differences between codes
manually.
Creating a patch file to share with others
To create a patch file containing the code differences between a branch of your code
and your master branch, type the following text:
where BRANCH_NAME is the name of the branch that you want to compare against
the master branch.
You can also create a patch file for a given number of past commits. Typing:
will create a patch file for the last 3 commits. If you want the most recent commit then
use -1 instead, etc.
These commands will pipe the output from the git format-patch command to a file
named by you (in this case my-patch-file.diff, but you may select whatever name
you wish). You can then include the patch file as an email attachment and send it to
other GEOS-Chem users, or the GEOS-Chem Support Team.
When sending patch files to others, it is important that you specify the parent
commit (i.e. the commit that immediately precedes where the patch was made) for
your patch file.
Checking the validity of a patch file
Other users can also send you their source code revisions as patch files. If you want to
check the status of a Git patch file (i.e. what it will actually do) before you apply it to your
repository, you can use the git apply command as follows:
git apply --stat my-patch-file.diff
GeosCore/aerosol_mod.f | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)
The sample output listed above indicates that the patch contained 4 insertions and 3
deletions from only one file (aerosol_mod.f).
Note that the git apply --stat command does not apply the patch, but only shows
you the stats about what it'll do. For more detailed information about the patch, you can
open it in a text editor and examine it manually.
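If you want to process the statistics in a script, the summary line of the sample output can be parsed with a regular expression. This is a minimal Python sketch under the assumption that the summary line has the form shown above. The function name parse_stat_summary is hypothetical, and the pattern also accepts the singular wording ("1 file changed").

```python
import re

SUMMARY_RE = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)


def parse_stat_summary(line: str) -> dict:
    """Extract file, insertion, and deletion counts from a --stat summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a summary line: " + line)
    files, insertions, deletions = match.groups()
    return {
        "files": int(files),
        "insertions": int(insertions or 0),
        "deletions": int(deletions or 0),
    }


print(parse_stat_summary("1 files changed, 4 insertions(+), 3 deletions(-)"))
# {'files': 1, 'insertions': 4, 'deletions': 3}
```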
You can also find out if the patch will install in your Git repository, or if there will be
problems. You can also use the git apply command to do this:
git apply --check my-patch-file.diff
The most common error is that the patch was made from an earlier
version of GEOS-Chem. In that instance the situation can usually be rectified by having
the sender of the patch do a git pull to the last-released GEOS-Chem version and
then create the patch again.
Reading a patch file into your local repository
To ingest a patch file into your local Git repository you should first make a new branch.
Follow this procedure:
1. Pick Branch/Create from the menu (or type CTRL-N). Give your branch a
descriptive name like "Updates_from_xxxx" that will serve as a mnemonic.
2. Pick Branch/Checkout from the menu (or type CTRL-O) and switch to the
branch you just created.
3. To ingest the other person's source code changes, type:
You can then test the other person's revisions in the separate branch until you are sure
they are OK. Once satisfied with the changes, you can merge them back into
the master branch as described above.
--Bob Y. 13:36, 2 October 2013 (EDT)
Invalid email address error
If you get the following error while trying to run the command
git am < their-patch-file.diff:
which should ingest the changes from the patch file into your repository. Why the
difference? Long story short:
1. Try compiling GEOS-Chem and running for a few time steps to make sure
everything is fine.
2. Check out the master branch.
3. Merge the patch branch into your master branch.
4. Delete the patch branch.
This will merge the changes from the master branch of the remote repository into
your master branch.
1. Clone GEOS-Chem
2. Open the gitk browser by typing gitk & at the command line.
3. In the top-left window of gitk, find the commit that you want to revert to.
Usually this will be denoted with a yellow tag (e.g. v8-03-01 or
v8-03-01-benchmark). However, if there are any post-release patches, be sure you
select the oldest one.
4. Right click on the commit text to open a context menu. Select Create new
branch.
5. A new dialog box will pop up asking you to name the branch.
Type Code.v8-03-01 and press OK.
6. Close gitk and open the Git GUI by typing git gui &
7. From the Git GUI dropdown menu, select Branch / Checkout, and then
pick Code.v8-03-01.
That's it! We now have two branches that represent different GEOS-Chem versions.
1. The master branch represents the state of the code as of the latest release
2. The Code.v8-03-01 branch represents the state of the code as of the
v8-03-01 release.
You can work on the v8-03-01 branch as you wish. You can create further branches
off of the v8-03-01 branch. The nice thing about this method is that you can always
revert to the latest release by just switching back to the master branch (with Branch
/ Checkout from the Git GUI dropdown menu).
You can also use this same method to check out older versions of files in any of
the GEOS-Chem run directories.
Adding a patch that was made to a previous version
Let's say you are currently working on the latest version of GEOS-Chem, and
somebody gives you a patch that they added into their own GEOS-Chem v8-03-02
code. You can add the patch into your code in such a way that the modification will
be at the head of the revision history. Here is the procedure.
3. Start the Gitk browser. From the dropdown menu in Git Gui, select:
4. In the Gitk revision, look for the parent commit of the patch in the revision history
(upper-left) window. The parent commit is the one immediately preceding the patch. If
you are not sure where the parent commit is, you can ask the person who sent you the
patch.
5. We will create a new branch from the parent commit into which the code updates will
be placed. Once you have found the parent commit, right click on the commit text to open
a context menu. Select:
6. We need to checkout (i.e. switch to) the new branch. Go back to the Git GUI. From the
dropdown menu, select:
7. Locate the patch file that contains the update you wish to add to the code. Let's
assume that this is called patch-file.diff. We will now apply this to your source code
directory:
NOTE: patch-file.diff does not have to be in the source code directory. It can be
anywhere, as long as you specify the full file path. Here we assume it’s
in ~/my_patch_directory.
8. In the Git GUI, press the F5 key to refresh the display.
9. In the GitK browser, press the F5 key to refresh the display.
10. Now we want to merge the master branch into the patch-install-point branch. This
will bring in all of the previous commits from the master branch into the
patch-install-point branch, while keeping the new commits from the patch at the top of the
revision history. Switch to the Git Gui window and pick from the menu:
11. The merge may result in some conflicts in some source code files. A conflict is a
difference in the source code which Git cannot rectify. Most often conflicts are caused by
2 comments having the same number. Git will add the following lines to the source code:
<<<<<<< HEAD
... lines of existing code ...
=======
... lines of new code ...
>>>>>>>
If there are conflicts, you will have to go through these manually. You can just hand-edit
the source code files with your favorite text editor. If there are no conflicts, you can skip
ahead to Step 14.
12. Once you have finished resolving all conflicts, commit the modified files to the
repository. In the Git Gui, click on each of the files in the “Unstaged Changes” window
(this will tell Git to commit them to the repository). Then click on the “Sign off” button and
click on the “Commit” button.
13. At this point you will have 2 branches. The master branch represents the pristine,
unmodified code from the remote repository. The patch-install-point branch
represents master branch plus the code from the patch that we added.
14. Test the code in patch-install-point to make sure that it is functioning properly (i.e.
run a benchmark simulation or a short test run).
15. Once you are certain that the code in patch-install-point is good, then merge
patch-install-point back into the master branch. From the Git GUI dropdown menu:
gitk
1 Overview
o 1.1 Why use Git?
o 1.2 Advantages of using Git
2 Tutorials about Git
o 2.1 For beginners
o 2.2 For more advanced users
3 Using Git with GEOS-Chem and GAMAP
o 3.1 Obtaining and installing Git
o 3.2 First-time setup
o 3.3 Downloading GEOS-Chem and GAMAP
4 References
Overview
Why use Git?
GEOS–Chem model development is done in a distributed manner. Individual users from several
different institutions will download a recent GEOS–Chem version and modify it according to their
own particular research interests. When these source code modifications are deemed to be
mature, users will then submit them to the GEOS–Chem Support Team for inclusion into
the mainline "standard" model.
In the past, the GEOS–Chem source code and run directories were distributed to the user
community as a series of TARBALL (i.e. *.tar.gz) files via anonymous FTP. The advantage of this
method was that one would only have to download a single file. However, as the number of
GEOS–Chem users (and submitted source code modifications) grew, this method became
unwieldy. For example, if only a single file needed to be updated, the entire TARBALL file would
have to be regenerated. This often became a source of confusion and error.
Given the large number of user code submissions, robust source code management techniques
must be employed in order to ensure the integrity of the GEOS–Chem code. Therefore,
the GEOS-Chem Support Team has selected the Git version control software for GEOS–Chem
source code management.
Advantages of using Git
Git offers many improvements over previous source code management software such
as CVS and Subversion.
1. Git avoids some of the limitations of CVS (which is by now 20-year-old software).
2. Git is a distributed source code management system. Instead of having one central
   GEOS–Chem repository residing on a single server, Git allows you to keep an
   identical copy (a.k.a. "clone") of the GEOS–Chem source code repository on your
   own system. Having several copies of the GEOS–Chem repository allows for
   redundancy in case of catastrophic server failure or other such calamity.
3. Modifications that you make to your own repository will not affect the repositories of
   other users. (That is, unless you consciously decide to "push" your changes to
   another repository).
4. When you are ready to submit your source code modifications for inclusion into the
   "standard" code, the GEOS–Chem Support Team can simply get them with a Git
   "pull" operation.
5. Git allows you to save out your source code changes to a "patch" file (a text file with
   a list of source code differences). This can be emailed to other users and applied to
   their local source code repository.
git --version
then you are good to go. (The actual version # doesn't matter.) If not,
then you (or your sysadmin) may obtain the Git source code (or
binaries) from the Git website.
First-time setup
Before using Git for the first time, you need to set up
your ~/.gitconfig file. Open a text editor and then cut & paste the
text from this sample .gitconfig file. Then save it as ~/.gitconfig.
Be sure to change your name and email accordingly; this is how Git will
know who you are!
Please see the following pages which describe how to download the
GEOS-Chem and GAMAP source code packages via Git.
Downloading GEOS-Chem and GAMAP
Please see the following wiki pages which contain detailed information
about how to use Git to download and modify the GEOS-Chem and
GAMAP source code packages:
ftp://ftp.as.harvard.edu/gcgrid/geos-chem/1mo_benchmarks
Contents
1 Overview
o 1.1 List of GEOS-Chem input files
o 1.2 Chemical mechanism files ship with the GEOS-Chem source code
2 The input.geos file
o 2.1 Simulation Menu
o 2.2 Timestep menu
o 2.3 Advected Species Menu
o 2.4 Transport Menu
o 2.5 Convection Menu
o 2.6 Emissions Menu
o 2.7 Aerosol Menu
o 2.8 Deposition Menu
o 2.9 Chemistry Menu
o 2.10 Output Menu
o 2.11 GAMAP Menu
o 2.12 Diagnostic Menu
o 2.13 Planeflight Menu
o 2.14 ND48 Menu
o 2.15 ND49 Menu
o 2.16 ND50 Menu
o 2.17 ND51 and ND51b Menus
o 2.18 ND63 Menu
o 2.19 Prod and Loss Menu
o 2.20 Benchmark Menu
o 2.21 Nested Grid Menu
o 2.22 Passive Species Menu
o 2.23 CH4 simulation menu
o 2.24 CO2 Simulation Menu
o 2.25 POPs simulation menu
o 2.26 Mercury Simulation Menu
o 2.27 Radiation Menu
3 The HEMCO_Config.rc file
o 3.1 Overview
o 3.2 Enabling and disabling emissions
4 The HEMCO_Diagn.rc file
5 The HISTORY.rc file
6 The Planeflight.dat file
7 Input files for the photolysis mechanism
Overview
This page describes the input files that are read by GEOS-Chem. These files will reside in the
various GEOS-Chem run directories. Each run directory is customized for a unique combination
of simulation, horizontal resolution, and met field type, and contains the various input files with
which you select options for your GEOS-Chem simulation.
You can generate GEOS-Chem run directories from the GEOS-Chem Unit Tester. Please see
our Creating GEOS-Chem run directories wiki page for detailed instructions. We recommend
that you create a different run directory for each of your GEOS-Chem simulations to avoid
overwriting output with subsequent model runs.
Note that run directories compatible with previous versions of GEOS-Chem will not work with
GEOS-Chem v11-01.
List of GEOS-Chem input files
Below is a table listing GEOS-Chem input files that reside in the run directory.
KPP/Standard
KPP/Tropchem
KPP/SOA_SVPOA
If you need to add new species or reactions, you can modify these globchem.* files and then
rebuild the KPP code. It is recommended that you place any custom modifications to the GEOS-
Chem chemistry mechanisms into this folder:
KPP/Custom
You can ask the GEOS-Chem Support Team for assistance with this.
In the very near future, we hope to build KPP fresh each time when GEOS-Chem is compiled.
This will make it much easier to change the existing chemical mechanisms.
--Bob Yantosca (talk) 18:52, 21 March 2018 (UTC)
Note: /path/to/data/ indicates the path to the root data folder on your system. If you don't
know where this is, ask your IT staff.
Nested-grid simulations
Setting up GEOS-Chem nested grid simulations
Available met data for nested grid simulations
Line numbers are not part of the input.geos file, but have been included for reference.
Mercury simulation
Global Terrestrial Mercury Model
Line numbers are not part of the input.geos file, but have been included for reference.
O3 (Ozone)
ME (Methane)
SU (Sulfate)
NI (Nitrate)
AM (Ammonium)
BC (Black carbon)
OA (Organic aerosol)
SS (Sea salt)
DU (Mineral dust)
PM (All particulate matter)
ST (Stratospheric aerosol, UCX simulation only)
--Bob Yantosca (talk) 16:45, 20 March 2018 (UTC)
If you would like to turn off NEI2011, simply change the true in this line:
to false:
And if you also wanted to turn off the MEGAN biogenic emissions, simply change the on in this
line:
to off:
etc.
--Bob Yantosca (talk) 23:08, 16 November 2016 (UTC)
#------------------------------------------------------------------------------
#                  GEOS-Chem Global Chemical Transport Model                  !
#------------------------------------------------------------------------------
#BOP
#
# !MODULE: HEMCO_Diagn.rc
#
# !DESCRIPTION: Configuration file for netCDF diagnostic output from HEMCO.
#\\
#\\
# !REMARKS:
# Customized for the benchmark simulation.
# TO DO: Add long names, which are used for netCDF variable attributes.
#
# !REVISION HISTORY:
# 13 Feb 2018 - E. Lundgren - Initial version
#EOP
#------------------------------------------------------------------------------
#BOC
# Name              Spec  ExtNr  Cat  Hier  Dim  OutUnit       LongName
###############################################################################
#####  ACET emissions (in bpch ND11, ND28, ND34, ND36, ND46)              #####
###############################################################################
EmisACET_Total      ACET  -1     -1   -1    3    molec/cm2/s
EmisACET_Anthro     ACET  0      1    -1    3    atomsC/cm2/s
EmisACET_BioBurn    ACET  111    -1   -1    2    atomsC/cm2/s
EmisACET_Biofuel    ACET  0      2    -1    2    atomsC/cm2/s
EmisACET_Biogenic   ACET  108    -1   -1    2    atomsC/cm2/s
EmisACET_DirectBio  ACET  108    -1   -1    2    atomsC/cm2/s  ACET_from_direct_emissions
EmisACET_MethylBut  ACET  108    -1   -1    2    atomsC/cm2/s  ACET_from_methyl_butenol
EmisACET_Monoterp   ACET  108    -1   -1    2    atomsC/cm2/s  ACET_from_monoterpenes
EmisACET_Ocean      ACET  101    -1   -1    2    atomsC/cm2/s  ACET_from_ocean_source
###############################################################################
#####  ALD2 emissions (in bpch ND28, ND34, ND36, ND46. ALD2_Ocean         #####
#####  and is new)                                                        #####
###############################################################################
EmisALD2_Total      ALD2  -1     -1   -1    3    molec/cm2/s
EmisALD2_Anthro     ALD2  0      1    -1    3    atomsC/cm2/s
EmisALD2_BioBurn    ALD2  111    -1   -1    2    atomsC/cm2/s
EmisALD2_Biofuel    ALD2  0      2    -1    2    atomsC/cm2/s
EmisALD2_Biogenic   ALD2  108    -1   -1    2    atomsC/cm2/s
EmisALD2_Ocean      ALD2  101    -1   -1    2    atomsC/cm2/s
EmisALD2_Senesc     ALD2  0      4    -1    2    atomsC/cm2/s
As you can see, you can archive not only the total emissions for a given species, but also the
individual emission sectors. This gives you much more flexibility than the GEOS-Chem bpch
diagnostic output.
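To illustrate how the columns of an entry line fit together, here is a minimal Python sketch that splits one line of the listing into named fields. It assumes that the columns are separated by whitespace and follow the order given in the header line (Name, Spec, ExtNr, Cat, Hier, Dim, OutUnit, and an optional LongName). The names DiagnEntry and parse_diagn_line are hypothetical; this is not how HEMCO itself reads the file.

```python
from typing import NamedTuple, Optional


class DiagnEntry(NamedTuple):
    name: str
    spec: str
    ext_nr: int
    cat: int
    hier: int
    dim: int
    out_unit: str
    long_name: Optional[str]


def parse_diagn_line(line: str) -> Optional[DiagnEntry]:
    """Parse one entry line; return None for blank lines and # comments."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    if len(fields) < 7:
        raise ValueError("expected at least 7 columns: " + line)
    name, spec, ext_nr, cat, hier, dim, out_unit = fields[:7]
    long_name = fields[7] if len(fields) > 7 else None
    return DiagnEntry(name, spec, int(ext_nr), int(cat), int(hier), int(dim),
                      out_unit, long_name)


entry = parse_diagn_line(
    "EmisACET_Ocean ACET 101 -1 -1 2 atomsC/cm2/s ACET_from_ocean_source"
)
print(entry.name, entry.ext_nr, entry.dim, entry.long_name)
# EmisACET_Ocean 101 2 ACET_from_ocean_source
```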
Once you have your HEMCO_Diagn.rc file customized for your given simulation, make sure that
you have these settings in your HEMCO_Config.rc file:
DiagnFile: HEMCO_Diagn.rc
DiagnPrefix: HEMCO_diagnostics
DiagnFreq: End
where:
This page describes the GEOS-Chem Unit Tester package that corresponds to GEOS-Chem
v11-02 (aka 12.0.0).
NOTE: If you are still using GEOS-Chem v11-01, then please see our GEOS-Chem unit tester
for v11-01 wiki page.
Contents
1 Overview
2 Installing the GEOS-Chem Unit Tester
o 2.1 Requirements
o 2.2 Downloading the GEOS-Chem Unit Tester
2.2.1 Reverting to an older state
o 2.3 Directory structure
3 Using the GEOS-Chem Unit Tester
o 3.1 What the GEOS-Chem Unit Tester does
o 3.2 Specifying unit test options with an input file
3.2.1 The INPUTS section
3.2.2 The RUNS section
o 3.3 Running the GEOS-Chem Unit Tester
4 Examining the results
o 4.1 Output files written to each unit test run directory
o 4.2 Log files written to the unit test log directory
o 4.3 Unit test results and error messages
o 4.4 How unit tests are scored
o 4.5 Unit test results displayed as a web page
5 GEOS-Chem Unit Test Results
o 5.1 LEGEND
o 5.2 Unit test results displayed as a text file
6 Cleaning old files
Overview
The GEOS-Chem Unit Tester contains the various Makefiles, scripts, and run directories that
you will need to run GEOS-Chem. With the Unit Tester, you can:
Action                  Description

Create GEOS-Chem run    You can use a script to create fresh copies of GEOS-Chem run
directories             directories for each supported GEOS-Chem simulation. Each run
                        directory that you create will contain all of the necessary input
                        files with the proper settings. For detailed instructions, please
                        visit our Creating GEOS-Chem run directories wiki page.

Set up GEOS-Chem        You can create run directories that can be used to perform
7-day timing tests      7-model-day timing tests. This will allow you to quickly evaluate
                        the performance of GEOS-Chem on your system. For more information,
                        please see our Timing tests with GEOS-Chem v11-02 wiki page.

Run GEOS-Chem unit      An individual GEOS-Chem Unit Test will look for computational and
tests                   numerical issues in simulations for a given combination of:
                        1. Met field type
                        2. Horizontal grid
                        3. Simulation type
                        We recommend that you perform frequent unit tests as you develop
                        code, as this is the best way to find many common types of errors.
                        For more information about how to submit unit tests, please
                        see the Using the Unit Tester section below.

Run GEOS-Chem           A Difference Test validates that structural updates in a given
difference tests        GEOS-Chem "Development" (or "Dev") version produce identical
                        results when compared to a "Reference" (or "Ref") version.
                        You can use a script that will create difference test run
                        directories for any of the supported GEOS-Chem simulations. For
                        more information on how to set up GEOS-Chem difference tests,
                        please see our Performing Difference Tests with GEOS-Chem wiki page.
--Bob Yantosca (talk) 19:12, 20 March 2018 (UTC)
This will download a fresh copy of the GEOS-Chem-UnitTest repository into a directory
named UT, and place you in the master branch. The master branch always corresponds to the
last publicly-released GEOS-Chem version.
If you would like to check out the Unit Tester that corresponds to a specific numbered GEOS-
Chem version you can type one of the following commands:
git checkout 12.0.0
git checkout v11-02-rc
git checkout v11-02e
etc.
--Bob Yantosca (talk) 21:12, 21 June 2018 (UTC)
Reverting to an older state
If you would like to use the Unit Tester to validate an older version of GEOS-Chem, then you can
use the gitk browser to open a new branch at a past commit as follows:
1. Open the gitk browser by typing gitk & at the command line.
2. In the top-left window of gitk, find the commit that you want to revert to. Usually this will
be denoted with a yellow tag (e.g. v9-02-Provisional-Release or v10-01b, etc.).
3. Right click with the mouse. This will open a context menu. Select Create new branch.
4. A new dialog box will pop up asking you to name the branch.
Type OLDER_BRANCH and press OK.
5. Close gitk and open the Git GUI by typing git gui &
6. From the Git GUI dropdown menu, select Branch / Checkout, and then
pick OLDER_BRANCH.
That's it! We now have two branches that represent different GEOS-Chem versions.
Directory Description
logs/ Log files containing output from the GEOS-Chem Unit Test simulations will
be sent here.
perl/ Contains the Perl scripts that are used to submit GEOS-Chem Unit Test
simulations.
1. Run GEOS-Chem with strict debugging flags with OpenMP parallelization turned OFF.
This is called the single processor or sp test phase.
2. Run GEOS-Chem with strict debugging flags with OpenMP parallelization turned ON.
This is called the multi processor or mp test phase.
The strict debugging flags will look for common computational and numerical issues, such as
#------------------------------------------------------------------------------
#                  GEOS-Chem Global Chemical Transport Model                  !
#------------------------------------------------------------------------------
#BOP
#
# !INCLUDE: UnitTest.input
#
# !DESCRIPTION: Input file that specifies debugging options for the
# GEOS-Chem unit tester.
#\\
#\\
# !REMARKS:
# To omit individual unit tests (or input settings), place a # comment
# character in the first column.
#
# For a complete explanation of how to customize the settings below for
# your installation, please see these wiki posts:
#
# http://wiki.geos-chem.org/GEOS-Chem_Unit_Tester#The_INPUTS_section
# http://wiki.geos-chem.org/GEOS-Chem_Unit_Tester#The_RUNS_section
#
# !REVISION HISTORY:
# Type 'gitk' at the prompt to browse the revision history.
#EOP
#------------------------------------------------------------------------------
#
# !INPUTS:
#
# %%% ID tags %%%
#
VERSION : v11-02
DESCRIPTION : Tests GEOS-Chem v11-02
#
# %%% Data path and HEMCO settings %%%
#
DATA_ROOT : /n/holylfs/EXTERNAL_REPOS/GEOS-CHEM/gcgrid/data/ExtData
HEMCO_ROOT : {DATAROOT}/HEMCO
VERBOSE : 3
WARNINGS : 3
#
# %%% Code and queue settings %%%
#
CODE_DIR : {HOME}/GC/Code.v11-02
MAKE_CMD : make -j4 BOUNDS=y DEBUG=y FPEX=y NO_ISO=y BPCH_DIAG=y NC_DIAG=n
SUBMIT : sbatch
#
# %%% Unit tester path names %%%
#
UNIT_TEST_ROOT : {HOME}/UT
RUN_ROOT : {UTROOT}/runs
RUN_DIR : {RUNROOT}/{RUNDIR}
JOB_DIR : {UTROOT}/jobs
LOG_DIR : {UTROOT}/logs/{VERSION}
PERL_DIR : {UTROOT}/perl
#
# %%% Web and text display options %%%
#
TEMPLATE : {PERLDIR}/ut_template.html
TXT_GRID : {LOGDIR}/{VERSION}.results.txt
WEB_GRID : {LOGDIR}/{VERSION}.results.html
WEB_PUSH : NONE
#
# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
# %%% OPTIONAL COMMANDS: %%%
# %%% %%%
# %%% If your system uses the SLURM scheduler, then you can %%%
# %%% define several #SBATCH tags that will be added to the top %%%
# %%% of the job script. Otherwise you can comment these lines %%%
# %%% out with the # comment character. %%%
# %%% %%%
# %%% NOTE: If you do use these SLURM commands, then also be %%%
# %%% sure to set the SUBMIT command above to "sbatch". %%%
# %%% %%%
# %%% The INIT_COMMANDS tag will let you specify any optional %%%
# %%% initialization commands that will be placed at the top %%%
# %%% of the job script. For example, if your system requires %%%
# %%% that modules need to be loaded from within the job script, %%%
# %%% you can specify those instructions here. %%%
# %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
#
SLURM_CPUS : 4
SLURM_NODES : 1
SLURM_TIME : 1-12:00
SLURM_MEMORY : 35000
SLURM_PARTITION : huce_intel
SLURM_CONSTRAINT : broadwell
SLURM_MAILTYPE : END
VERSION An ID tag that will be added to all log files and output files.
HEMCO_ROOT Specifies the top-level path for the HEMCO data directory tree.
The {DATAROOT} token in HEMCO_ROOT will be replaced with the
value you specify for the DATA_ROOT option. Leave this as-is.
VERBOSE Specifies the level of debug output that will be sent to the
HEMCO log file. (0=no debug output; 3=max debug output)
Recommended setting: 0
WARNINGS Specifies the level of warning messages that will be sent to the
HEMCO log file. (0=no warnings; 3=max warnings)
Recommended setting: 1
CODE_DIR Specifies the path of the source code directory.
MAKE_CMD Specifies the make command for your system. Usually this
is make, but on some systems this may be gmake. Recommended
options are:
JOB_DIR Specifies the directory where the unit test job script will be
created.
LOG_DIR Specifies the directory where log files from this unit test
simulation will be sent.
SLURM_CPUS Specifies the number of CPUs that will be used for the unit test
(with #SBATCH -c)
SLURM_NODES Specifies the number of nodes that will be used for the unit test
(with #SBATCH -N)
SLURM_MEMORY Specifies the total amount of memory (in MB) needed to run the
unit tests.
SLURM_CONSTRAINT Allows you to restrict the unit test job to CPUs of a specific type
(with #SBATCH --constraint=).
SLURM_MAILTYPE Specifies when SLURM will send email notifications
(with #SBATCH --mail-type=).
The most common option is END, which notifies you when the
unit test job ends. You can also use ALL, which will send you
an email when the job starts, ends, and if it dies prematurely.
SLURM_MAILUSER Specifies the email address (with #SBATCH --mail-user=) where
notifications from SLURM will be sent.
SLURM_STDOUT Specifies the file that contains the redirected "stdout" stream
(i.e. echo-back of commands to the screen).
NOTE: If the unit test fails, you should check this file first to
see what the specific error messages were.
INIT_COMMANDS Specifies any additional commands that are specific to
your computer system. For example, you might need to source a
.bashrc file, or to issue a command to load modules. You can
leave this commented out otherwise.
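The token substitution described in the table above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Unit Tester's actual Perl implementation: the function names `parse_inputs` and `expand` are hypothetical, and the mapping from tokens such as {UTROOT} to options such as UNIT_TEST_ROOT is inferred from the sample file shown earlier.

```python
import re

# Assumed mapping from {TOKEN} names to the option that supplies the value.
# Inferred from the sample UnitTest.input; the real scripts may differ.
TOKEN_SOURCES = {
    "DATAROOT": "DATA_ROOT",
    "UTROOT": "UNIT_TEST_ROOT",
    "RUNROOT": "RUN_ROOT",
    "PERLDIR": "PERL_DIR",
    "LOGDIR": "LOG_DIR",
    "VERSION": "VERSION",
}


def parse_inputs(text):
    """Collect 'KEY : value' pairs, skipping blank lines and # comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)  # split once, so 1-12:00 stays intact
        options[key.strip()] = value.strip()
    return options


def expand(value, options, extra=None, depth=0):
    """Replace {TOKEN} markers, recursing so nested tokens also resolve."""
    extra = extra or {}
    if depth > 10:
        raise ValueError("token definitions appear to be circular")

    def lookup(match):
        name = match.group(1)
        if name in extra:  # e.g. HOME, supplied by the caller
            return extra[name]
        source = TOKEN_SOURCES.get(name)
        if source in options:
            return expand(options[source], options, extra, depth + 1)
        return match.group(0)  # unknown tokens such as {RUNDIR} are kept

    return re.sub(r"\{(\w+)\}", lookup, value)


sample = """
VERSION        : v11-02
UNIT_TEST_ROOT : {HOME}/UT
LOG_DIR        : {UTROOT}/logs/{VERSION}
TXT_GRID       : {LOGDIR}/{VERSION}.results.txt
SLURM_TIME     : 1-12:00
"""
options = parse_inputs(sample)
print(expand(options["TXT_GRID"], options, {"HOME": "/home/user"}))
```

Tokens that are filled in per run, such as {RUNDIR}, are deliberately left untouched by this sketch.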
# !RUNS:
# Specify the debugging runs that you want to perform below.
# You can deactivate runs by commenting them out with "#".
#
#--------|-----------|------|----------------|------------|--------------|-----|
# MET    | GRID      | NEST | SIMULATION     | START DATE | END DATE     |EXTRA|
#--------|-----------|------|----------------|------------|--------------|-----|
  geosfp   4x5         -      benchmark        2016070100   201607010020   -
  geosfp   4x5         -      RnPbBe           2016010100   201601010020   -
  geosfp   4x5         -      Hg               2016010100   201601010020   -
  geosfp   4x5         -      tagHg            2016010100   201601010020   -
  geosfp   4x5         -      POPs             2016070100   201607010020   -
  geosfp   4x5         -      CH4              2016070100   201607010020   -
  geosfp   025x03125   na     CH4              2016070100   201607010010   -
  geosfp   4x5         -      tagO3            2016070100   201607010020   -
  geosfp   4x5         -      tagCO            2016070100   201607010020   -
  geosfp   2x25        -      CO2              2016070100   201607010020   -
  geosfp   4x5         -      aerosol          2016070100   201607010020   -
  geosfp   4x5         -      standard         2016070100   201607010020   -
  geosfp   4x5         -      tropchem         2016070100   201607010020   -
  geosfp   4x5         -      complexSOA       2016070100   201607010020   -
  geosfp   4x5         -      complexSOA_SVPOA 2016070100   201607010020   -
  geosfp   4x5         -      aciduptake       2016070100   201607010020   -
  geosfp   4x5         -      marinePOA        2016070100   201607010020   -
  geosfp   4x5         -      TOMAS15          2016070100   201607010030   -
  geosfp   4x5         -      TOMAS40          2016070100   201607010030   -
  geosfp   4x5         -      RRTMG            2016070100   201607010040   -
  geosfp   025x03125   ch     tropchem         2016070100   201607010010   -
  geosfp   025x03125   na     tropchem         2016070100   201607010010   -
!END OF RUNS:
#EOP
#------------------------------------------------------------------------------
NOTE: For clarity, only simulations using GEOS-FP meteorology have been shown above. You
can also schedule simulations using MERRA-2 meteorology.
Options:
MET
Specifies the met field type. Allowable values are:
GRID
Specifies the horizontal grid. Allowable values are:
Value Grid type
05x0666 Selects the GMAO 0.5° x 0.666° grid for use with GEOS-5 met
fields only.
(You must also specify a value for NEST.)
05x0625 Selects the GMAO 0.5° x 0.625° grid for use with MERRA-2 met
fields only.
(You must also specify a value for NEST.)
025x03125 Selects the GMAO 0.25° x 0.3125° grid for use with GEOS-FP met
fields only.
(You must also specify a value for NEST.)
NEST
Specifies the nested grid. May only be used if the value
for GRID is 05x0625 or 025x03125. Allowable values are:
- Skips the nested grid option (default). Use this for global
simulations.
as Selects the Asia (AS) nested grid option for use with MERRA-2 met
fields only.
ch Selects the China (CH) nested grid option for use with GEOS-
FP met fields only.
SIMULATION
Specifies the simulation type. Allowable values are:
Value Simulation
Full-chemistry simulations
Specialty simulations
START DATE
Specifies the start date (YYYYMMDD) and hour (hh) of the unit test.
END DATE
Specifies the ending date (YYYYMMDD), hour (hh), and minutes (mm) of the unit test.
1. We typically run from 2016070100 to 2016070101 for GEOS-FP and MERRA-2 met fields.
2. The actual year for the met fields does not matter much
for the unit tests. Each unit test runs the same simulation
twice (with and without parallelization) and then tests for
identical results.
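To make the column layout of a RUNS entry concrete, here is a minimal Python sketch that splits one entry into its fields and parses the date stamps. It is not part of the Unit Tester; `parse_stamp` and `parse_run` are hypothetical helper names. The run ID pattern follows the `{MET}_{GRID}_{SIM}` and `{MET}_{GRID}_{SIM}_{NEST}` naming used for the log files in this guide.

```python
from datetime import datetime


def parse_stamp(stamp):
    """Parse YYYYMMDDhh (10 digits) or YYYYMMDDhhmm (12 digits)."""
    formats = {10: "%Y%m%d%H", 12: "%Y%m%d%H%M"}
    fmt = formats.get(len(stamp))
    if fmt is None:
        raise ValueError(f"unexpected date stamp: {stamp!r}")
    return datetime.strptime(stamp, fmt)


def parse_run(line):
    """Split one whitespace-separated RUNS entry into named fields."""
    met, grid, nest, sim, start, end, extra = line.split()
    run_id = f"{met}_{grid}_{sim}"
    if nest != "-":  # "-" means a global (non-nested) simulation
        run_id += f"_{nest}"
    return {
        "met": met,
        "grid": grid,
        "nest": None if nest == "-" else nest,
        "simulation": sim,
        "start": parse_stamp(start),
        "end": parse_stamp(end),
        "extra": None if extra == "-" else extra,
        "run_id": run_id,
    }


run = parse_run("geosfp 025x03125 na CH4 2016070100 201607010010 -")
minutes = (run["end"] - run["start"]).total_seconds() / 60
print(run["run_id"], minutes)
```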
--Bob Yantosca (talk) 14:35, 21 March 2018 (UTC)
Running the GEOS-Chem Unit Tester
Once you have added your desired options to the input file, you
may then proceed to start the GEOS-Chem Unit Tester. At the Unix
prompt, type:
cd perl
./gcUnitTest FILENAME
File Description Notes
trac.avg.*.mp Binary punch (bpch) format file containing diagnostic output from
the multi-processor test.
More information on this file is contained below.
{VERSION}.{MET}_{GRID}_{SIM}.log.sp (Global simulations)
{VERSION}.{MET}_{GRID}_{SIM}_{NEST}.log.sp (Nested-grid simulations)
GEOS-Chem log file output from the single-processor test phase of the unit test
for the given met field, grid, simulation type (and, if applicable, nested-grid
region).
{VERSION}.{MET}_{GRID}_{SIM}.log.mp (Global simulations)
{VERSION}.{MET}_{GRID}_{SIM}_{NEST}.log.mp (Nested-grid simulations)
GEOS-Chem log file output from the multi-processor test phase of the unit test
for the given met field, grid, simulation type (and, if applicable, nested-grid
region).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%
%%% GEOS-CHEM UNIT TEST RESULTS FOR VERSION: BpchOnly_Mar19
%%% job sent to queue @ 2018/03/19 14:17:43
%%%
%%% DESCRIPTION: c40ec4d Restructure calculation of binned dust AOD diag to avoid seg fault in GCHP
%%%
%%% BUILT WITH: make -j8 BOUNDS=y DEBUG=y FPEX=y NO_ISO=y BPCH_DIAG=y NC_DIAG=n
%%%
%%% This is the main log file, which shows output from
%%% each individual phase of the unit test sequence.
%%%
%%% Log files from individual unit-test runs are also
%%% stored in this same directory.
%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
###############################################################################
### VALIDATION OF GEOS-CHEM OUTPUT FILES
### Run ID: geosfp_4x5_RnPbBe
##@
### IDENTICAL : GEOSChem_restart.201607010020.nc.{sp,mp}
### IDENTICAL : HEMCO_diagnostics.201607010000.nc.{sp,mp}
### IDENTICAL : HEMCO_restart.201607010020.nc.{sp,mp}
### IDENTICAL : trac_avg.geosfp_4x5_RnPbBe.201607010000.{sp,mp}
##%
###############################################################################
... etc ...
If, on the other hand, you compiled with NC_DIAG=y then your
results file will look similar to this:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%
%%% GEOS-CHEM UNIT TEST RESULTS FOR VERSION: Nc_Mar19
%%% job sent to queue @ 2018/03/19 14:17:43
%%%
%%% DESCRIPTION: c40ec4d Restructure calculation of binned dust AOD diag to avoid seg fault in GCHP
%%%
%%% BUILT WITH: make -j8 BOUNDS=y DEBUG=y FPEX=y NO_ISO=y BPCH_DIAG=n NC_DIAG=y
%%%
%%% This is the main log file, which shows output from
%%% each individual phase of the unit test sequence.
%%%
%%% Log files from individual unit-test runs are also
%%% stored in this same directory.
%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
###############################################################################
### VALIDATION OF GEOS-CHEM OUTPUT FILES
### Run ID: geosfp_4x5_RnPbBe
##@
### IDENTICAL : GEOSChem.CloudConvFlux.20130701_0000z.nc4.{sp,mp}
### IDENTICAL : GEOSChem.DryDep.20130701_0000z.nc4.{sp,mp}
### IDENTICAL : GEOSChem.LevelEdgeDiags.20130701_0000z.nc4.{sp,mp}
### IDENTICAL : GEOSChem.RadioNuclide.20130701_0000z.nc4.{sp,mp}
### IDENTICAL : GEOSChem.SpeciesConc.20130701_0000z.nc4.{sp,mp}
### IDENTICAL : GEOSChem.StateMet.20130701_0000z.nc4.{sp,mp}
### IDENTICAL : GEOSChem.WetLossConv.20130701_0000z.nc4.{sp,mp}
### IDENTICAL : GEOSChem.WetLossLS.20130701_0000z.nc4.{sp,mp}
### IDENTICAL : GEOSChem_restart.201307010020.nc.{sp,mp}
### IDENTICAL : HEMCO_diagnostics.201307010000.nc.{sp,mp}
### IDENTICAL : HEMCO_restart.201307010020.nc.{sp,mp}
##%
###############################################################################
Outcome Description
Result              Criteria
PASS                All corresponding pairs of {*.sp, *.mp} restart files are IDENTICAL, and
                    all corresponding pairs of {*.sp, *.mp} diagnostics files are IDENTICAL
PASS WITH WARNINGS  All corresponding pairs of {*.sp, *.mp} restart files are IDENTICAL, but
                    one or more corresponding pairs of {*.sp, *.mp} diagnostics files are DIFFERENT
FAIL                One or more corresponding pairs of {*.sp, *.mp} restart files are DIFFERENT
                    Any *.sp or *.mp restart file is MISSING
                    Any *.sp or *.mp diagnostics file is MISSING
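The criteria in the table above amount to a small decision rule. The following Python sketch expresses it under the assumption that each file pair has already been given one of the statuses IDENTICAL, DIFFERENT, or MISSING; the function name `classify` is hypothetical and this is not the Unit Tester's own validation code.

```python
def classify(restart_statuses, diagnostic_statuses):
    """Return PASS, PASS WITH WARNINGS, or FAIL for one unit test.

    Each argument is a list of per-file-pair statuses:
    "IDENTICAL", "DIFFERENT", or "MISSING".
    """
    all_statuses = list(restart_statuses) + list(diagnostic_statuses)

    # Any missing file, or any differing restart file, fails the test.
    if "MISSING" in all_statuses or "DIFFERENT" in restart_statuses:
        return "FAIL"

    # Restart files match, but at least one diagnostics file differs.
    if "DIFFERENT" in diagnostic_statuses:
        return "PASS WITH WARNINGS"

    return "PASS"


print(classify(["IDENTICAL", "IDENTICAL"], ["IDENTICAL", "DIFFERENT"]))
```

Restart files are checked first because a difference there is treated as more serious than a difference in the diagnostics.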
[Unit test results grid: one cell per combination of met field, horizontal grid, and simulation type]
Unit Tests
Columns: Standard, Tropchem, complexSOA, complexSOA-SVPOA, Acid Uptake, Marine POA, RRTMG, TagHg, POPs, TagCO, TagO3, CH4, CO2, Rn-Pb-Be, Hg, Aerosol, TOMAS15, TOMAS40
Rows: MERRA-2 @ 4° x 5°, GEOS-FP @ 4° x 5°, MERRA-2 @ 2° x 2.5°, GEOS-FP @ 2° x 2.5°
Benchmark Unit Tests
Columns: Benchmark
Rows: GEOS-FP @ 4° x 5°
Nested-Grid Unit Tests
Columns: Tropchem, CH4, Hg
Rows: GEOS-FP NA @ 0.25° x 0.3125°, GEOS-FP CH @ 0.25° x 0.3125°, MERRA-2 NA @ 0.5° x 0.625°, MERRA-2 AS @ 0.5° x 0.625°
LEGEND
The unit test was successful.
Further investigation is necessary (e.g. the restart files were identical but the diagnostic output differed).
The unit test failed.
A unit test was not performed for this combination of met field, horizontal grid, and simulation type.
cd perl
./cleanFiles
cd UT/perl
In this directory there is a Perl script named gcCopyRunDirs that you will use to
generate fresh copies of GEOS-Chem run directories. This script uses an
input file named CopyRunDirs.input, which is also located in the perl directory.
IMPORTANT: The CopyRunDirs.input file that comes with the Unit Tester
provides only a small sample of the possible run directories that you can
create. You can add as many entries to the RUNS section as you wish. See
the UnitTest.input file for a complete list of all possible run directories that you
can add to CopyRunDirs.input.
Your CopyRunDirs.input file will look something like this:
#------------------------------------------------------------------------------
# GEOS-Chem Global Chemical Transport Model !
#------------------------------------------------------------------------------
#BOP
#
# !DESCRIPTION: Input file that specifies configuration for creating and
# copying a run directory from the UnitTester.
#\\
#\\
# !REMARKS:
# For a complete description of how to customize the settings in the
# INPUTS and RUNS sections, see the following wiki posts:
#
# wiki.geos-chem.org/Creating_GEOS-Chem_run_directories#Section_1:_INPUTS
# wiki.geos-chem.org/Creating_GEOS-Chem_run_directories#Section_2:_RUNS
#
# !REVISION HISTORY:
# 18 Mar 2015 - R. Yantosca - Initial version
# 19 Mar 2015 - E. Lundgren - Simplify content for only copying run dirs
# 19 May 2015 - R. Yantosca - Now can specify VERBOSE and WARNINGS options
#EOP
#------------------------------------------------------------------------------
#
# !INPUTS:
#
# %%% ID tags %%%
#
VERSION : v11-01
DESCRIPTION : Create run directory from UnitTest
#
# %%% Data path and HEMCO settings %%%
#
DATA_ROOT : /n/holylfs/EXTERNAL_REPOS/GEOS-CHEM/gcgrid/data/ExtData
HEMCO_ROOT : {DATAROOT}/HEMCO
VERBOSE : 0
WARNINGS : 1
#
# %%% Unit tester path names %%%
#
UNIT_TEST_ROOT : {HOME}/UT
RUN_ROOT : {UTROOT}/runs
RUN_DIR : {RUNROOT}/{RUNDIR}
PERL_DIR : {UTROOT}/perl
#
# %%% Target directory and copy command %%%
#
COPY_PATH : {HOME}/GC/rundirs
COPY_CMD : cp -rfL
#
# !RUNS:
# Specify the runs directories that you want to copy below.
# Here we provide a few examples, but you may copy additional entries from
# UnitTest.input and modify the dates as needed. You can deactivate copying
# run certain directories by commenting them out with "#".
#
#--------|-----------|------|------------|------------|------------|---------|
# MET    | GRID      | NEST | SIMULATION | START DATE | END DATE   | EXTRA?  |
#--------|-----------|------|------------|------------|------------|---------|
  geosfp   4x5         -      standard     2013070100   2013080100   -
# geosfp   4x5         -      gc_timing    2013070100   2013070800   -
# geosfp   4x5         -      tropchem     2013070100   2013070101   -
# geosfp   4x5         -      soa          2013070100   2013070101   -
# geosfp   4x5         -      soa_svpoa    2013070100   2013070101   -
# geosfp   4x5         -      aciduptake   2013070100   2013070101   -
# geosfp   4x5         -      UCX          2013070100   2013070101   -
# geosfp   4x5         -      RRTMG        2013070100   2013070101   -
# geosfp   4x5         -      RnPbBe       2013070100   2013070101   -
# geosfp   4x5         -      Hg           2013010100   2013010101   -
# geosfp   4x5         -      POPs         2013070100   2013070101   -
# geosfp   4x5         -      TOMAS40      2013070100   2013070101   -
# geosfp   4x5         -      CH4          2013070100   2013070101   -
# geosfp   4x5         -      tagO3        2013070100   2013070101   -
# geosfp   4x5         -      tagCO        2013070100   2013070101   -
# geosfp   2x25        -      CO2          2013070100   201307010030 -
# geosfp   4x5         -      aerosol      2013070100   2013070101   -
# geosfp   025x03125   ch     tropchem     2013070100   201307010010 -
# geosfp   025x03125   na     tropchem     2013070100   201307010010 -
# gchp     4x5         -      tropchem     2013070100   2013070101   -
!END OF RUNS:
#EOP
#------------------------------------------------------------------------------
Option Description
VERSION An ID tag that will be added to all log files and output files.
DESCRIPTION A short (1-sentence) description of the purpose of this specific
file (optional). This may be used to differentiate different input
files, such as if you pre-configure several for future re-use.
DATA_ROOT Specifies the path for your root-level data directory.
HEMCO_ROOT Specifies the top-level path for the HEMCO data directory
tree.
VERBOSE Specifies the level of debug output that will be sent to the
HEMCO log file. (0=no debug output; 3=max debug output)
Recommended setting: 0
WARNINGS Specifies the level of warning messages that will be sent to
the HEMCO log file. (0=no warnings; 3=max warnings)
Recommended setting: 1
UNIT_TEST_ROOT Specifies the path to the GEOS-Chem Unit Tester.
RUN_ROOT Specifies the top-level unit test run directories.
Leave this as-is.
PERL_DIR Specifies the directory where the unit test Perl scripts are
found.
Leave this as-is.
COPY_PATH Specifies the directory on your disk server where copies of the
GEOS-Chem run directories will be created.
COPY_CMD Specifies the command used to copy run directories from the
GEOS-Chem Unit Tester to COPY_PATH.
Section 2: RUNS
The layout of the RUNS section is identical to the RUNS section in the GEOS-
Chem Unit Tester input file. This enables copying and pasting simulation
settings text from UnitTest.input into CopyRunDirs.input.
For example, the following line:
#
# !RUNS:
# Specify the debugging runs that you want to perform below.
# You can deactivate runs by commenting them out with "#".
#
#--------|-----------|------|------------|------------|------------|---------|
# MET    | GRID      | NEST | SIMULATION | START DATE | END DATE   | EXTRA?  |
#--------|-----------|------|------------|------------|------------|---------|
  geosfp   4x5         -      standard     2013070100   2013080100   -
will tell the gcCopyRunDirs script to create a run directory for a GEOS-Chem
simulation with the following settings:
./gcCopyRunDirs
If you do not pass a file name to gcCopyRunDirs, then the gcCopyRunDirs script
will use the CopyRunDirs.input file that you just modified.
If you wish, you can create many customized copies of CopyRunDirs.input. For
example, suppose you edit CopyRunDirs.input to generate a full-chemistry run
directory. You can then save it as a separate file and use it explicitly
with gcCopyRunDirs.
cp CopyRunDirs.input CopyRunDirs.fullchem   # Input file set up to only copy the full-chemistry run directories
./gcCopyRunDirs CopyRunDirs.fullchem
ls -1   # Get directory listing
brc.dat
dust.dat
FJX_j2j.dat
FJX_spec.dat
getRunInfo
h2so4.dat
HEMCO_Config.rc
HEMCO_restart.201307010000.nc
input.geos
jv_spec_mie.dat
Makefile
org.dat
output/
README
so4.dat
soot.dat
ssa.dat
ssc.dat
v11-01.run
validate
NOTE: Run directories for other simulations may contain other files not
pictured here.
The input.geos and HEMCO_Config.rc files have been customized for this
particular simulation. They were created from the corresponding template
files input.geos.template and HEMCO_Config.template in the Unit Tester. The
Perl script getRunInfo is used by the Makefile to extract information about the
simulation from input.geos. HEMCO and tracer restart files are also included
in every run directory but care must be taken when using them. See the
section below for more information about restart files.
Make sure to check the start and end date so that your simulation will run for
July 2013. You can then create the GEOS-Chem benchmark run directory by
typing ./gcCopyRunDirs. Navigate to the newly
created geosfp_4x5_benchmark run directory. To compile and run your
benchmark simulation, type:
make -j4 mp
That will create a geos.mp executable file that you can use to submit your
GEOS-Chem benchmark simulation to a queue system.
NOTE: If you are compiling GEOS-Chem within the code directory, and
not within a run directory created from the GEOS-Chem Unit Tester, you
will need to pass the UCX=y option in your make command.
GEOS-Chem Online User's Guide
7. Compiling GEOS-Chem
You can find more information about GNU Make at the GNU
Operating System website: https://www.gnu.org/software/make/.
Contents
1 Overview
2 Directory Structure
o 2.1 GEOS-Chem 12
o 2.2 GEOS-Chem v11-02
o 2.3 GEOS-Chem v11-01
3 Compilation sequence
4 Compiling GEOS-Chem
o 4.1 Setting the proper environment variables
o 4.2 Ways to compile GEOS-Chem
o 4.3 Compiling in the top-level code directory
4.3.1 Makefile Options
4.3.2 Specifying compiler flags
4.3.3 Setting default flags
4.3.4 Compiling examples
o 4.4 Compiling in a run directory
4.4.1 Setting up the run directory Makefile
4.4.2 Makefile options
4.4.3 Information from the last time GEOS-Chem was compiled
o 4.5 Advanced compilation techniques
4.5.1 Compiling with Unix shell scripts
5 Technical Issues
o 5.1 GNU Make is required
o 5.2 Install LaTeX utilities for auto documentation
o 5.3 Compile-time options that can slow down GEOS-Chem
6 Previous issues that have now been resolved
o 6.1 Add a more robust test for netCDF-Fortran in Makefile_header.mk
o 6.2 Removed the COMPILER variable from Makefile_header.mk for a
cleaner build sequence
o 6.3 Bug fix: Specifying NO_REDUCED=no now compiles GEOS-Chem for
reduced grids
o 6.4 TRACEBACK=y is now the default setting
o 6.5 Compiler cannot find certain files
7 References
8 Obsolete information
o 8.1 GEOS-Chem v10-01
o 8.2 GEOS-Chem v9-02
o 8.3 Backup files not excluded from compilation
Overview
Starting with GEOS-Chem v8-02-03, we have modified the directory structure of GEOS-Chem.
Rather than keeping all source code files in a single directory, we now have partitioned source
code files into several subdirectories with most subdirectories having their own Makefile. This
was done for the following reasons:
1. To facilitate the installation of 3rd-party software packages such as:
HEMCO
RRTMG radiative transfer model
KPP chemical solver
Aerosol microphysics codes (e.g. APM)
ISORROPIA II
Terrestrial models (e.g. CASA, GTMM)
into GEOS-Chem. Our guiding principle is that all 3rd-party software packages should be
cleanly separable from the main-line GEOS-Chem code. This will allow the 3rd-party
software packages to be updated without having an impact on the rest of the GEOS-
Chem source code files.
2. To simplify the maintenance of the GEOS-Chem code files. Without subdirectories,
there would have been hundreds of source code files in a single directory, and it would
have been very difficult to keep track of them all.
Directory Structure
Here we list the directory structure for recent model versions.
GEOS-Chem 12
The table below lists the directory structure in GEOS-Chem 12 along with
descriptions of each subdirectory and its Makefile (if one is present).
NOTE: In previous versions, these files were header files that were inlined via
the #include statement. These have since been converted to F90 modules in order to
facilitate GEOS-Chem HP development.
Code/History Directory containing module files to archive diagnostics from
GEOS-Chem "Classic" simulations to netCDF file format.
Makefile: Yes, it compiles the code in History and creates library
file lib/libHistory.a
isoropiaII_main_mod.F
Code/ISOROPIA Directory containing the unmodified ISORROPIA II source code files
from Thanos Nenes and Havala Pye:
isoropiaIIcode.F
isrpia.inc
Makefile: Yes, it compiles the code in ISOROPIA and creates library
file lib/libIsoropia.a.
Compilation sequence
GEOS-Chem's makefiles compile source code files in the following order:
1. Files in NcdfUtil/
2. Files in KPP/ (first pass)
3. Files in Headers/
4. Files in GeosUtil/
5. Files in KPP/ (second pass)
6. Files in History/
7. Files in HEMCO/Core/
8. Files in HEMCO/Extensions/
9. Files in HEMCO/Interfaces/
10. Files in ISOROPIA/
11. Files in GeosRad/ (if RRTMG is enabled)
12. Files in GeosCore/
Each of these directories has its own makefile that contains a list of all of the
source code files within the directory, plus dependent routines (a.k.a. the
"dependencies" list). Source code files that do not refer to any modules have a
dependencies list such as:
dao_mod.o : dao_mod.F
Whereas a source code file that refers to several modules (in this
case, GeosCore/chemistry_mod.F) has a dependencies list like this:
chemistry_mod.o : chemistry_mod.F90 \
    aerosol_mod.o isoropiaII_mod.o \
    c2h6_mod.o carbon_mod.o \
    dust_mod.o drydep_mod.o \
    global_ch4_mod.o mercury_mod.o \
    pops_mod.o \
    rpmares_mod.o RnPbBe_mod.o \
    seasalt_mod.o strat_chem_mod.o \
    sulfate_mod.o tagged_co_mod.o \
    tagged_o3_mod.o tomas_mod.o \
    flexchem_mod.o
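To show how such a dependencies list is read, the Python sketch below joins the backslash-continued lines of a rule and splits it into a target and its prerequisites. It is an illustration only (GNU Make does this parsing itself), the helper name `parse_rule` is hypothetical, and the rule text is an abbreviated copy of the example above.

```python
import re


def parse_rule(text):
    """Split a 'target : prerequisites' rule into (target, [prerequisites]).

    A backslash at the end of a line continues the rule on the next line.
    """
    joined = re.sub(r"\\\s*\n", " ", text)  # fold continuation lines
    target, _, prerequisites = joined.partition(":")
    return target.strip(), prerequisites.split()


rule = (
    "chemistry_mod.o : chemistry_mod.F90 \\\n"
    "    aerosol_mod.o isoropiaII_mod.o \\\n"
    "    flexchem_mod.o\n"
)
target, prerequisites = parse_rule(rule)

# The first prerequisite is the source file; the rest are object files
# that must be built before the target can be compiled.
source, objects = prerequisites[0], prerequisites[1:]
print(target, source, objects)
```

Every object file listed after the source file must already exist before the target is compiled, which is why the directory-level compilation sequence listed earlier matters.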
Compiling GEOS-Chem
In this section we provide information about how to compile the GEOS-Chem source
code into an executable file that you can run.
Setting the proper environment variables
Before you compile GEOS-Chem, please take a moment to make sure that you have
defined the proper Unix environment variables that tell GEOS-Chem which compiler
you are using, and where the netCDF library has been installed. For complete
information, please see our Setting Unix environment variables for GEOS-Chem
wiki page.
--Bob Yantosca (talk) 16:14, 10 March 2017 (UTC)
Ways to compile GEOS-Chem
There are two basic ways to compile GEOS-Chem:
make help
at the Unix prompt. You will then get a screen similar to this:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% GEOS-Chem Help Screen %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
--------------------------------------------------------
TARGET may be one of the following:
--------------------------------------------------------
all Default target (synonym for "lib exe")
lib Builds GEOS-Chem source code
libcore Builds GEOS-Chem objs & libs only in GeosCore/
libheaders Builds GEOS-Chem objs & libs only in Headers/
libhistory Builds GEOS-Chem objs & libs only in History/
libiso Builds GEOS-Chem objs & libs only in ISOROPIA/
libkpp Builds GEOS-Chem objs & libs only in KPP/
libnc Builds GEOS-Chem objs & libs only in NcdfUtil/
librad Builds GEOS-Chem objs & libs only in GeosRad/
libutil Builds GEOS-Chem objs & libs only in GeosUtil/
ncdfcheck Determines if the netCDF library installation works
exe Creates GEOS-Chem executable
clean Removes *.o, *.mod files in source code subdirs only
realclean Removes all *.o, *mod, *.lib *.a, *.tex, *ps, *pdf files everywhere
distclean Synonym for "make realclean"
doc Builds GEOS-Chem documentation (*.ps, *.pdf) in doc/
docclean Removes *.tex, *.pdf, *,ps from doc/
tauclean Removes *.pdb, *.inst, *.pp, and *.continue.* files produced by TAU
help Displays this help screen
--------------------------------------------------------
REQUIRED-FLAGS include:
--------------------------------------------------------
MET=____ Specifies the met field type
--> Options: geosfp merra2
--------------------------------------------------------
OPTIONAL-FLAGS may be one or more of the following:
--------------------------------------------------------
Chemistry options:
------------------
CHEM=___ Specifies which chemistry mechanism is used
--> Options: Standard Tropchem UCX SOA SOA_SVPOA
--> Default: Standard
Diagnostics options:
--------------------
BPCH_DIAG=n Disable binary-punch output for diagnostics
--> Default: BPCH_DIAG=y
Debugging flags:
----------------
BOUNDS=y Turns on subscript-array checking
--> Default: BOUNDS=n
Meteorological fields
Horizontal and vertical grid information
Compiler type
Chemistry mechanism
Options for special simulations (e.g. mercury)
Debugging options
Switches are set with Makefile options from the command line by specifying compiler
flags (e.g. UCX=y). Options that are not relevant to your system will be ignored.
Type make help at the command line while in the top-level GEOS-Chem source
code directory for a complete list of compiler flag options (excerpted above).
Please keep the following items in mind when specifying compiler flags.
When starting with a new version of GEOS-Chem or a new simulation type you
should always issue the following command:
make realclean
This is especially important if you are changing the met field type or horizontal grid
settings. The command will remove all previously-created compiler output files (e.g. *.a,
*.o, *.mod) and executables. After doing make realclean, you can recompile
GEOS-Chem with your new options.
You can speed up compilation by specifying make -j4 to compile four files
at a time. If you have more CPUs available you can change this number.
You must specify a value for MET. Please see our Overview of GMAO met
data products wiki page for more information on the meteorology fields.
Note that the Makefile in run directories copied from the Unit Tester
automatically extracts MET from input.geos, so MET does not need to be
passed in the make command.
Allowable values are: merra2, merra, geosfp, geos5, geos4, and gcap
You must specify a value for GRID. Please see Appendix 2 of the GEOS-Chem
User's Guide for more information on the horizontal grids. Note that
the Makefile in run directories copied from the Unit Tester automatically
extracts GRID from input.geos, so GRID does not need to be passed in the
make command.
Allowable values are: 4x5, 2x25, 05x0666, 025x03125
If you select GRID=05x0666 or GRID=025x03125, then you must select
a nested-grid option. Note that the Makefile in run directories copied from
the Unit Tester automatically extracts NEST from input.geos, so NEST does
not need to be passed in the make command. Options for NEST include:
NEST=as for the Asia nested grid (MERRA-2 only)
NEST=ch for the China nested grid
NEST=na for the North America nested grid
NEST=eu for the Europe nested grid
Makefile target names are case-sensitive. For example, you should
type make all, not make ALL.
Makefile flags will accept case-insensitive input. You may omit dashes
from met field names. Each of the following is acceptable:
MET=geosfp, MET=GEOSFP, MET=GeosFp, MET=geos-fp, etc.
Makefile flags that require a simple yes/no answer will accept case-
insensitive input. For example:
DEBUG=yes, DEBUG=Yes, DEBUG=y, DEBUG=No, DEBUG=NO, etc.
Some debug compiler flags may slow down your simulation, so we
recommend using them for short test simulations only. For more information,
see this wiki post.
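The case-insensitive handling of flags described above can be pictured with the following Python sketch. It only mimics the documented behavior for the listed examples; the function names are hypothetical, and the actual matching is done inside the GEOS-Chem makefiles, which may accept additional spellings.

```python
def is_yes(value):
    """Interpret a yes/no flag value such as DEBUG=Yes or DEBUG=NO."""
    return value.strip().lower() in {"y", "yes"}


def normalize_met(value):
    """Lower-case a met field name and drop dashes, e.g. GEOS-FP -> geosfp."""
    return value.strip().lower().replace("-", "")


for spelling in ["geosfp", "GEOSFP", "GeosFp", "geos-fp"]:
    print(spelling, "->", normalize_met(spelling))

for spelling in ["yes", "Yes", "y", "No", "NO"]:
    print(spelling, "->", is_yes(spelling))
```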
--Lizzie Lundgren 14:31, 15 April 2015 (EDT)
--Bob Yantosca (talk) 18:59, 16 May 2018 (UTC)
Setting default flags
The REQUIRED-FLAGS and OPTIONAL-FLAGS can be specified in one of two
ways:
For example, if you wanted to build the GEOS-Chem executable using the KPP solver
with Rosenbrock integrator and the SOA mechanism you could type:
2. As an environment variable
If you don't wish to keep typing KPPSOLVER=rosenbrock CHEM=SOA every time, you
could instead use:
Specifying options as environment variables allows you to predefine settings so that you
don't have to type them on the command line every time you build GEOS-Chem.
The environment variables can also be placed in your ~/.cshrc or ~/.bashrc so
that they are initialized when you log in.
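The relationship between the two ways of setting flags can be sketched as follows. GNU Make gives variables set on the command line precedence over variables inherited from the environment, so an environment default can still be overridden for a single build. The Python below models that precedence with plain dictionaries; `FLAG_NAMES`, `effective_flags`, and `make_command` are hypothetical names used only for this illustration.

```python
# A small subset of flag names, chosen for illustration only.
FLAG_NAMES = ("MET", "GRID", "CHEM", "KPPSOLVER")


def effective_flags(environment, command_line):
    """Combine flag sources; command-line values override the environment."""
    flags = {name: environment[name] for name in FLAG_NAMES if name in environment}
    flags.update(command_line)
    return flags


def make_command(flags, jobs=4):
    """Render the flags as a make command line (sorted for readability)."""
    settings = [f"{name}={value}" for name, value in sorted(flags.items())]
    return " ".join(["make", f"-j{jobs}", *settings])


environment = {"KPPSOLVER": "rosenbrock", "CHEM": "SOA", "SHELL": "/bin/bash"}
flags = effective_flags(environment, {"CHEM": "Standard", "MET": "geosfp"})
print(make_command(flags))
```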
--Melissa Sulprizio 18:04, 7 April 2015 (EDT)
Compiling examples
All examples below are for compiling from the top-level GEOS-
Chem source code directory. Note that starting in v11-01, we
recommend that users compile from within the run directory by
using the Makefile that comes with all run directories copied from
the GEOS-Chem Unit Tester. The run directory router Makefile
invokes the top-level code directory Makefile to execute any of the
commands in the examples below but creates and stores log files
locally in the run directory. See the Compiling in a run directory
section below for instructions.
Example of compiling with Make:
(1) To build the GEOS-Chem executable for the 4x5 grid with GEOS-FP
meteorology, one can simply type:
(7) If you wish to compile the code but not make the executable,
you may type:
make doc
(10) To remove all *.o , *.mod , and executable files from the
source code directories (but not from the mod , lib ,
or bin directories), type:
make clean
(11) And to remove everything and start over from scratch, type:
make realclean
make help
at the Unix prompt. You will then get a screen similar to this:
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%    GEOS-Chem Run Directory Makefile Help Screen    %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
----------------------------------------------------------
TARGET may be one of the following:
----------------------------------------------------------
all            Default target (synonym for "unittest")
...
make -j4 mp
Prior to compiling, all .mp files (executable, log, and data) will be
removed from the directory. All standard output generated while
creating and then running the new executable will be sent to the run
log file. As when building only, the executable created will reside in
the /bin directory within CODE_DIR and a copy will be placed in the
run directory with the name geos.mp. Your compile settings will be
stored in the log file lastbuild.mp as well as in the run log file. Output data
files are appended with the .mp suffix except for restart and
HEMCO diagnostic files that do not correspond to the simulation
end date. Finally, your compile and run settings will be printed to the
terminal window.
The GNU Make utility is a powerful and flexible software
development tool. We encourage you to explore the existing
Makefile options and to customize the Makefile to suit your
own needs.
--Lizzie Lundgren 10:59, 22 April 2015 (EDT)
Information from the last time GEOS-Chem was compiled
When compiling GEOS-Chem v11-01 from one of the run directory
Makefiles, a file called lastbuild.mp or lastbuild.sp will be
created. This file contains the settings that were used to compile
GEOS-Chem, for your reference.
#!/bin/tcsh -f
# The first line above is the script definition line (keep it free of comments)
cd Code.v9-02                          # your code dir
rm -f log                              # clear log file
make -j4 MET=geos5 GRID=4x5 > log      # build the code
exit(0)                                # exit normally
You can then run this script interactively, or submit it to the queue
system on your computational cluster.
Technical Issues
GNU Make is required
The makefiles described above are designed to be used with GNU
Make. This is free, open-source software and is bundled with most
popular Linux distributions today (e.g. Ubuntu, Fedora, etc.).
It is recommended to use GNU Make because not all Unix make
utilities are compatible with each other. GNU Make is probably the
most flexible make utility available.
If GNU Make is not already installed on your system, then you (or
your IT guru) will have to install it. You can check to see if GNU
Make is already installed by typing:
make --version
at the Unix prompt. GNU Make will display a version screen similar
to this:
Please also read the GNU Make Reference Document for more
detailed information.
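The check described above can also be scripted. The following is a minimal sketch that relies on GNU Make's version banner starting with the words "GNU Make". The helper name is_gnu_make_banner is hypothetical.

```shell
#!/bin/sh
# Succeed when the given banner line looks like GNU Make's version banner.
is_gnu_make_banner() {
    case $1 in
        "GNU Make "*) return 0 ;;
        *)            return 1 ;;
    esac
}

# First line printed by whatever "make" is on the PATH (empty if none).
banner=$(make --version 2>/dev/null | head -n 1)

if is_gnu_make_banner "$banner"; then
    echo "Found: $banner"
else
    echo "GNU Make not found on PATH"
fi
```

On some systems GNU Make is installed under the name gmake. In that case the same test can be applied to the first line of gmake --version.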
Install LaTeX utilities for auto documentation
Many of the updated GEOS-Chem source code files now use
the ProTeX documentation headers. The protex script (included in
the Code/doc subdirectory) strips the information from the
documentation headers located in GEOS-Chem
Modules
Subroutines
Functions
Include files
and creates a LaTeX format (*.tex) file. The latex utility can
then be used to produce output in PDF (*.pdf) and PostScript
(*.ps) formats from the LaTeX file. This all happens automatically
when you type:
make doc
at the Unix prompt. However, you must make sure to have the
LaTeX utilities installed on your machine in order to be able to
generate this automatic documentation. The LaTeX utilities include:
Utility Description
The LaTeX utilities may be packaged with your version of Unix. Ask
your IT guru for more information.
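Before running make doc, you can check whether the utilities are on your PATH. The sketch below is one way to do that. The helper name missing_tools is hypothetical, and only latex is checked because it is the one utility named in the text above. Pass any other utility names your setup requires.

```shell
#!/bin/sh
# Print each named tool that cannot be found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# "latex" is named in the text; add any other utilities your setup needs.
missing=$(missing_tools latex)
if [ -z "$missing" ]; then
    echo "latex found on PATH"
else
    echo "Not on PATH: $missing"
fi
```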
You may also want to install the following packages:
Utility Description
ghostview Reader for PostScript files
Command Description
If you have selected any of the above options, then try compiling
GEOS-Chem from scratch, i.e.,
make realclean
make -j4 MET=____ GRID=____
The issue was that the mod directory remained empty and the compiler had
difficulty finding the mod files. The problem was resolved by changing
line 1317 of Makefile_header.mk from pgfortran to pgi.
When looking into the core files I get error output such as this:
Image              PC                Routine            Line     Source
libifcoremt.so.5   00002B9EFA2188D3  Unknown            Unknown  Unknown
geos.mp            00000000011FCE35  regrid_a2a_mod_mp  1914     regrid_a2a_mod.F90
libiomp5.so        00002B9EFB70A8A3  Unknown            Unknown  Unknown
Have you seen this portion of the code causing a seg fault before?
Bob Yantosca replied:
to:
lib:
# Build all G-C libraries
@$(MAKE) libkpp
@$(MAKE) libutil
@$(MAKE) libiso
@$(MAKE) libcore
Obsolete information
These sections pertain to makefile options
for obsolete versions of GEOS-Chem. We
shall keep this information here for
reference.
GEOS-Chem v10-01
The table below lists the directory structure
in GEOS-Chem v10-01 along with
descriptions of each subdirectory and its
Makefile (if one is present).
Directory: ...
Description: ...
Is there a Makefile here? ... Makefile_header.mk defines compilation and
linking commands for the Fortran-90 compilers. These commands are common to
the makefiles in all subdirectories.

Directory: ...
Description: ... Model (GTMM) simulation
Is there a Makefile here? ...

Directory: ...
Description: ... NOTE: Due to the many wide-sweeping updates (e.g. HEMCO,
UCX, SOA) made in recent versions of GEOS-Chem, the APM package is now no
longer compatible with GEOS-Chem v10-01. The APM team is currently working
on bringing APM up to date.
Is there a Makefile here? ...

Directory: ...
Description: ... type definitions. NOTE: In previous versions, these files
were header files that were inlined via the #include statement. These have
since been converted to F90 modules in order to facilitate Grid-Independent
GEOS-Chem development.
Is there a Makefile here? ...

Directory: Code/ISOROPIA
Description: Directory containing the unmodified ISORROPIA II source code
files from Thanos Nenes and Havala Pye: isoropiaIIcode.F, isrpia.inc
Is there a Makefile here? Yes, it compiles the code in ISOROPIA and creates
library file lib/libIsoropia.a.

Directory: Code/KPP/int
Description: Directory containing the integrators (rosenbrock, runge-kutta,
lsodes, etc.) for KPP
Is there a Makefile here? No

Directory: Code/NcdfUtil/perl
Description: Directory containing perl scripts from the NcdfUtilities
package that can be used to generate Fortran code for defining, writing,
and reading a netCDF file
Is there a Makefile here? No

Directory: Code/bin
Description: Directory where executable (geos, geostomas, geosapm) files
will be sent
Is there a Makefile here? No

Directory: Code/lib
Description: Directory where library (*.a) files will be created
Is there a Makefile here? No

Directory: Code/mod
Description: Directory where module (*.mod) files will be sent
Is there a Makefile here? No

Directory: Code/obsolete
Description: Directory where obsolete source code files are placed for
reference if needed
Is there a Makefile here? No
Directory: ...
Description: ...
Is there a Makefile here? ... Makefile_header.mk defines compilation and
linking commands for the Fortran-90 compilers. These commands are common to
the makefiles in all subdirectories.

Directory: Code/Headers
Description: Directory containing module files with fixed parameters and
derived-type definitions. NOTE: In previous versions, these files were
header files that were inlined via the #include statement. These have since
been converted to F90 modules in order to facilitate Grid-Independent
GEOS-Chem development.
Is there a Makefile here? Yes, it compiles the code in Headers and creates
library file lib/libHeaders.a

Directory: Code/GeosUtil
Description: Directory containing GEOS-Chem utility modules
Is there a Makefile here? Yes, it compiles the code in GeosUtil and creates
library file lib/libGeosUtil.a

Directory: ...
Description: ... from Thanos Nenes and Havala Pye: isorropiaIIcode.F,
isrpia.inc
Is there a Makefile here? ...

Directory: Code/KPP/int
Description: Contains the integrators (rosenbrock, runge-kutta, lsodes,
etc.) for KPP.
Is there a Makefile here? No

Directory: Code/mod
Description: Directory where module (*.mod) files will be sent
Is there a Makefile here? No

Directory: Code/bin
Description: Directory where executable (geos) files will be sent
Is there a Makefile here? No

Directory: Code/obsolete
Description: Directory where obsolete source code files are placed for
future reference if need be
Is there a Makefile here? No
NOTE: Starting with GEOS-Chem v9-02,
the GeosTomas subdirectory has been
removed and code for the TOMAS
aerosol microphysics package has been
inlined into the GeosCore directory
using C-preprocessor statements. See
the TOMAS Aerosol Microphysics wiki
page for more information.
--Lizzie Lundgren 13:47, 15 April 2015 (EDT)
Backup files not excluded from compilation
Yes...I think in some v8-02-xx versions of the code we may have had a wildcard that
grepped for *.f* in the Makefile. This would have tried to compile the *.f~ files that are
generated when you use Emacs to edit the code.
In the GeosCore/Makefile you can replace any lines that look like:
# Source and object files
SRC = $(wildcard *.F*)
with
# Source and object files
SRC = $(wildcard *.F) $(wildcard *.F90)
and this should exclude all *.f~ and *.f90~ files from the compilation. This fix has since
been standardized in GEOS-Chem v8-03-01.
--Bob Y. 13:58, 27 August 2010 (EDT)
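The effect of the two wildcard patterns above can be reproduced with ordinary shell globbing, which follows the same matching rules as the wildcard function in GNU Make. The sketch below uses made-up file names in a scratch directory. It is only an illustration and does not touch any GEOS-Chem source files.

```shell
#!/bin/sh
# Scratch directory with two source files and two Emacs-style backups.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.F b.F90 a.F~ b.F90~

# Count how many names each pattern set expands to.
count() { echo "$#"; }

broad=$(count *.F*)          # also picks up a.F~ and b.F90~
narrow=$(count *.F *.F90)    # only a.F and b.F90

echo "*.F* matches $broad files; *.F plus *.F90 match $narrow"

cd / && rm -rf "$demo_dir"
```

The broad pattern matches the backup files because the trailing * accepts any suffix, including the ~ that Emacs appends. The two narrow patterns must match the whole file name, so the backups are skipped.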
We have been looking at the ammonia emissions in the model in our comparisons to
observational data from Africa and have a couple of questions.
Although the GEIA anthropogenic emissions are overwritten by other inventories, for
Africa at least, the model still uses the GEIA Natural and Biofuel emission of NH3: