CoCalc -- low_level_inspection.rst

GitHub Repository: yt-project/yt
Path: blob/main/doc/source/examining/low_level_inspection.rst
⁹²⁸ views
.. _low-level-data-inspection:

Low-Level Data Inspection: Accessing Raw Data
=============================================

yt can not only provide high-level access to data, such as through slices,
projections, object queries and the like, but it can also provide low-level
access to the raw data.

.. note:: This section is tuned for patch- or block-based simulations.  Future
          versions of yt will enable more direct access to particle and oct
          based simulations.  For now, these are represented as patches, with
          the attendant properties.

For a more basic introduction, see :ref:`quickstart` and more specifically
:doc:`../quickstart/2)_Data_Inspection`.

.. _examining-grid-hierarchies:

Examining Grid Hierarchies
--------------------------

yt organizes grids in a hierarchical fashion; a coarser grid that contains (or
overlaps with) a finer grid is referred to as its parent.  yt organizes these
only a single level of refinement at a time.  To access grids, the ``grids``
attribute on a :class:`~yt.geometry.grid_geometry_handler.GridIndex` object.  (For
fast operations, a number of additional arrays prefixed with ``grid`` are also
available, such as ``grid_left_edges`` and so on.)  This returns an instance of
:class:`~yt.data_objects.grid_patch.AMRGridPatch`, which can be queried for
either data or index information.

The :class:`~yt.data_objects.grid_patch.AMRGridPatch` object itself provides
the following attributes:

* ``Children``: a list of grids contained within this one, of one higher level
  of refinement
* ``Parent``: a single object or a list of objects this grid is contained
  within, one level of refinement coarser
* ``child_mask``: a mask of 0's and 1's, representing where no finer data is
  available in refined grids (1) or where this grid is covered by finer regions
  (0).  Note that to get back the final data contained within a grid, one can
  multiple a field by this attribute.
* ``child_indices``: a mask of booleans, where False indicates no finer data
  is available.  This is essentially the inverse of ``child_mask``.
* ``child_index_mask``: a mask of indices into the ``ds.index.grids`` array of the
  child grids.
* ``LeftEdge``: the left edge, in native code coordinates, of this grid
* ``RightEdge``: the right edge, in native code coordinates, of this grid
* ``dds``: the width of a cell in this grid
* ``id``: the id (not necessarily the index) of this grid.  Defined such that
  subtracting the property ``_id_offset`` gives the index into ``ds.index.grids``.
* ``NumberOfParticles``: the number of particles in this grid
* ``OverlappingSiblings``: a list of sibling grids that this grid overlaps
  with.  Likely only defined for Octree-based codes.

In addition, the method
:meth:`~yt.data_objects.grid_patch.AMRGridPatch.get_global_startindex` can be
used to get the integer coordinates of the upper left edge.  These integer
coordinates are defined with respect to the current level; this means that they
are the offset of the left edge, with respect to the left edge of the domain,
divided by the local ``dds``.

To traverse a series of grids, this type of construction can be used:

.. code-block:: python

   g = ds.index.grids[1043]
   g2 = g.Children[1].Children[0]
   print(g2.LeftEdge)

.. _examining-grid-data:

Examining Grid Data
-------------------

Once you have identified a grid you wish to inspect, there are two ways to
examine data.  You can either ask the grid to read the data and pass it to you
as normal, or you can manually intercept the data from the IO handler and
examine it before it has been unit converted.  This allows for much more raw
data inspection.

To access data that has been read in the typical fashion and unit-converted as
normal, you can access the grid as you would a normal object:

.. code-block:: python

   g = ds.index.grids[1043]
   print(g["gas", "density"])
   print(g["gas", "density"].min())

To access the raw data (as found in the file), use

.. code-block:: python

   g = ds.index.grids[1043]
   rho = g["gas", "density"].in_base("code")

.. _finding-data-at-fixed-points:

Finding Data at Fixed Points
----------------------------

One of the most common questions asked of data is, what is the value *at this
specific point*.  While there are several ways to find out the answer to this
question, a few helper routines are provided as well.  To identify the
finest-resolution (i.e., most canonical) data at a given point, use
the point data object::

  from yt.units import kpc
  point_obj = ds.point([30, 75, 80]*kpc)
  density_at_point = point_obj['gas', 'density']

The point data object works just like any other yt data object. It is special
because it is the only zero-dimensional data object: it will only return data at
the exact point specified when creating the point data object. For more
information about yt data objects, see :ref:`Data-objects`.

If you need to find field values at many points, the
:meth:`~yt.data_objects.static_output.Dataset.find_field_values_at_points`
function may be more efficient. This function returns a nested list of field
values at multiple points in the simulation volume. For example, if one wanted
to find the value of a mesh field at the location of the particles in a
simulation, one could do::

  ad = ds.all_data()
  ppos = ad["all", "particle_position"]
  ppos_den_vel = ds.find_field_values_at_points(
    [("gas", "density"), ("gas", "velocity_x")],
    ppos
  )

In this example, ``ppos_den_vel`` will be a list of arrays. The first array will
contain the density values at the particle positions, the second will contain
the x velocity values at the particle positions.

.. _examining-grid-data-in-a-fixed-resolution-array:

Examining Grid Data in a Fixed Resolution Array
-----------------------------------------------

If you have a dataset, either AMR or single resolution, and you want to just
stick it into a fixed resolution numpy array for later examination, then you
want to use a :ref:`Covering Grid <available-objects>`.  You must specify the
maximum level at which to sample the data, a left edge of the data where you
will start, and the resolution at which you want to sample.

For example, let's use the :ref:`sample dataset <getting-sample-data>`
``Enzo_64``.  This dataset is at a resolution of 64^3 with 5 levels of AMR,
so if we want a 64^3 array covering the entire volume and sampling just the
lowest level data, we run:

.. code-block:: python

   import yt

   ds = yt.load("Enzo_64/DD0043/data0043")
   all_data_level_0 = ds.covering_grid(level=0, left_edge=[0, 0.0, 0.0], dims=[64, 64, 64])

Note that we can also get the same result and rely on the dataset to know
its own underlying dimensions:

.. code-block:: python

   all_data_level_0 = ds.covering_grid(
       level=0, left_edge=[0, 0.0, 0.0], dims=ds.domain_dimensions
   )

We can now access our underlying data at the lowest level by specifying what
:ref:`field <field-list>` we want to examine:

.. code-block:: python

  print(all_data_level_0["gas", "density"].shape)
  # (64, 64, 64)

  print(all_data_level_0["gas", "density"])
  # array([[[  1.92588925e-31,   1.74647692e-31,   2.54787518e-31, ...,

  print(all_data_level_0["gas", "temperature"].shape)
  # (64, 64, 64)

If you create a covering grid that spans two child grids of a single parent
grid, it will fill those zones covered by a zone of a child grid with the
data from that child grid. Where it is covered only by the parent grid, the
cells from the parent grid will be duplicated (appropriately) to fill the
covering grid.

Let's say we now want to look at that entire data volume and sample it at
a higher resolution (i.e. level 2).  As stated above, we'll be oversampling
under-refined regions, but that's OK.  We must also increase the resolution
of our output array by a factor of 2^2 in each direction to hold this new
larger dataset:

.. code-block:: python

   all_data_level_2 = ds.covering_grid(
       level=2, left_edge=[0, 0.0, 0.0], dims=ds.domain_dimensions * 2**2
   )

And let's see what's the density in the central location:

.. code-block:: python

   print(all_data_level_2["gas", "density"].shape)
   (256, 256, 256)

   print(all_data_level_2["gas", "density"][128, 128, 128])
   1.7747457571203124e-31

There are two different types of covering grids: unsmoothed and smoothed.
Smoothed grids will be filled through a cascading interpolation process;
they will be filled at level 0, interpolated to level 1, filled at level 1,
interpolated to level 2, filled at level 2, etc. This will help to reduce
edge effects. Unsmoothed covering grids will not be interpolated, but rather
values will be duplicated multiple times.

To sample our dataset from above with a smoothed covering grid in order
to reduce edge effects, it is a nearly identical process:

.. code-block:: python

   all_data_level_2_s = ds.smoothed_covering_grid(
       2, [0.0, 0.0, 0.0], ds.domain_dimensions * 2**2
   )

   print(all_data_level_2_s["gas", "density"].shape)
   (256, 256, 256)

   print(all_data_level_2_s["gas", "density"][128, 128, 128])
   1.763744852165591e-31


Covering grids can also accept a ``data_source`` argument, in which case only
the cells of the covering grid that are contained by the ``data_source`` will be
filled. This can be useful to create regularized arrays of more complex
geometries. For example, if we provide a sphere, we see that the covering grid
shape is the same, but the number of cells with data is less

.. code-block:: python

   sp = ds.sphere(ds.domain_center, (0.25, "code_length"))
   cg_sp = ds.covering_grid(
       level=0, left_edge=[0, 0.0, 0.0], dims=ds.domain_dimensions, data_source=sp
   )

   print(cg_sp["gas", "density"].shape)
   (64, 64, 64)

   print(cg_sp["gas", "density"].size)
   262144

   print(cg_sp["gas", "density"][cg_sp["gas", "density"] != 0].size)
   17256

The ``data_source`` can be any :ref:`3D Data Container <region-reference>`. Also
note that the ``data_source`` argument is only available for the ``covering_grid``
at present (not the ``smoothed_covering_grid``).

.. _examining-image-data-in-a-fixed-resolution-array:

Examining Image Data in a Fixed Resolution Array
------------------------------------------------

In the same way that one can sample a multi-resolution 3D dataset by placing
it into a fixed resolution 3D array as a
:ref:`Covering Grid <examining-grid-data-in-a-fixed-resolution-array>`, one can
also access the raw image data that is returned from various yt functions
directly as a fixed resolution array.  This provides a means for bypassing the
yt method for generating plots, and allows the user the freedom to use
whatever interface they wish for displaying and saving their image data.
You can use the :class:`~yt.visualization.fixed_resolution.FixedResolutionBuffer`
to accomplish this as described in :ref:`fixed-resolution-buffers`.

High-level Information about Particles
--------------------------------------

There are a number of high-level helpers attached to ``Dataset`` objects to find
out information about the particles in an output file. First, one can check if
there are any particles in a dataset at all by examining
``ds.particles_exist``. This will be ``True`` for datasets the include particles
and ``False`` otherwise.

One can also see which particle types are available in a dataset. Particle types
that are available in the dataset's on-disk output are known as "raw" particle
types, and they will appear in ``ds.particle_types_raw``. Particle types that
are dynamically defined via a particle filter of a particle union will also
appear in the ``ds.particle_types`` list. If the simulation only has one
particle type on-disk, its name will by ``'io'``. If there is more than one
particle type, the names of the particle types will be inferred from the output
file. For example, Gadget HDF5 files have particle type names like ``PartType0``
and ``PartType1``, while Enzo data, which usually only has one particle type,
will only have a particle named ``io``.

Finally, one can see the number of each particle type by inspecting
``ds.particle_type_counts``. This will be a dictionary mapping the names of
particle types in ``ds.particle_types_raw`` to the number of each particle type
in a simulation output.
Product

Resources

Company