On my site, I host data on particle motion, including the motion of molecules in gases and liquids, as well as the motion of Lagrangian particles in turbulence. Most of these systems, like turbulence, exhibit intrinsically smooth time dynamics and can be simulated using conventional time-stepping algorithms such as direct numerical simulation (DNS). However, I also host data on the dynamics of hard-sphere fluids — statistical systems with intrinsically discrete, collision-based dynamics modeled using event-driven algorithms. Such data requires as different storage format, compared to what I used for the smooth particle dynamics.
The data for hard-sphere systems is obtained using the DynamO particle simulator, see respective page for the details. DynamO outputs data in XML format and is designed to provide processed data. I, on the other hand, offer raw trajectory data for users to process themselves. I consider HDF5 to be a more convenient format for this purpose. Below, I describe how I store DynamO output in HDF5.
Each HDF5 file contains particle trajectories from a simulation of a hard-sphere fluid at thermal equilibrium and includes the following objects:
Object name | Type | Description |
---|---|---|
t_start |
Float64 | Start time of the simulation (typically 0). |
t_end |
Float64 | End time of the simulation. |
ρ |
Float64 | Reduced density of the system. |
N |
Int64 | Number of particles in the system. |
L |
Float64 | Side length of the periodic cubic simulation domain, . |
README |
String | General information and reference to this website. |
t |
Group | Collision times for each particle. |
x |
Group | Positions of each particle at the collision times. |
collision_partner |
Group | Indexes of the particles with which each given particle collides. |
All the data is stored in reduced units, with the diameter of a hard sphere being the unit of length, and being the unit of time. Here is the mass of a hard sphere, is the temperature and is the Boltzmann constant. The unit of velocity is then .
The trajectories of the particles are stored in two groups, t
and x
. Each of them contain arrays, one for each particle, named with the particle’s index, from 000000001
to 00...0N
(names have fixed-length of 9 characters).
Consider the particle with index 42. In t/000000042
you will find an array of timestamps (1D Float64), starting with the simulation start time and ending with the simulation end time , and in between containing the times the 42nd particle experienced a collision:
In x/000000042
you will find the position of the 42nd particle sampled at these timestamps:
In the HDF5 file, x/000000042
is stored not as a array of 3D vectors, but as a 2D Float64 array of size . Note that reading it in Julia will give you an array of size .
The velocity of a particle is a piecewise constant function of time; between two consecutive collisions, say and , it can be computed as
Note that the data were obtained from numerical simulations with periodic boundary conditions. That means that the actual position of a particle in a simulation is a discontinuous function of time and the above formula for the particle’s velocity would not work. Therefore, instead of , I store the time integral of the particle’s velocity , i.e.
The actual coordinate can be obtained by applying the periodic boundary conditions to as
At least I think so, I never tried to compute .
Finally, for completeness I store the collision data in the group collision_partner
. For our 42nd particle collision_partner/000000042
contains a 1D Int64 array of particle indexes, these are the particles the 42nd particle collided with at the corresponding collision times t/000000042
.
Below is the Julia code to read the file hsf_1e-2.0.h5
into a dictionary hsf_data
. Note that the t
group of the HDF5 file is read into hsf_data["t"]
as an array of arrays. Similarly, x
is read into hsf_data["x"]
as an array of arrays, each containing static 3-element vectors representing the particle positions.
using HDF5
using Statistics
using ProgressMeter
using StaticArrays
function read_data(file_name)
d = Dict()
i_to_str(i) = lpad(i, 9, '0')
h5open(file_name, "r") do f
for dataset in ["t_start", "t_end", "ρ", "N", "L", "README"]
d[dataset] = read(f, dataset)
end
println("Reading t data...")
d["t"] = Array{Array{Float64, 1}, 1}(undef, d["N"])
@showprogress for i in eachindex(d["t"])
d["t"][i] = read(f, "t/" * i_to_str(i))
end
println("Reading x data...")
d["x"] = Array{Array{SVector{3, Float64}, 1}, 1}(undef, d["N"])
@showprogress for i in eachindex(d["x"])
d["x"][i] = reinterpret(SVector{3,Float64}, vec(read(f, "x/" * i_to_str(i))))
end
end
return d
end
hsf_data = read_data("hsf_1e-2.0.h5");