Data format for DynamO output

On my site, I host data on the dynamics of hard-sphere fluids obtained using the DynamO particle simulator. Below, I describe how I used DynamO to generate the data.

Simulation parameters

At equilibrium, the monodisperse hard-sphere fluid is described by only two dimensionless parameters: the number of particles

N

and the reduced number density

\rho^* = \sigma^3 N/V = \sigma^3 n

, where

\sigma

is the diameter of a hard sphere,

V

is the system volume,

n

is the particle number density. We are interested in the thermodynamic limit (

N \to \infty

, finite

\rho^*

) and run simulations with periodic boundary conditions for a relatively large number of particles

N

, so that

\rho^*

becomes the only relevant thermodynamic parameter.

However, to actually run the simulation, in addition to

\rho^*

, we need to specify time and length scales: the simulation duration

t_\text{run}

and the size of the domain (which corresponds to

N

). Both need to be large enough to capture temporal and spatial decorrelation of particle velocities. Further, to collect sufficient statistics on free flight times (or, equivalently, collision frequencies), the simulation has to be long enough for each particle to experience several collisions.

Below I discuss the characteristic scales of the problem and suggest values of

t_\text{run}

and

N

sufficient to resolve the statistics of interest. Because I am primarily interested in the gas-like rather than the liquid-like state of the hard-sphere fluid, all computations apply to a sufficiently dilute gas.

All the quantities are studies in reduced units, with

\sigma

being the unit of length and

\sigma \sqrt{m / (k_\text{B} T)}

being the unit of time. The units of velocity is then

\sqrt{k_\text{B} T / m}

Velocity autocorrelation timescale

The simulation duration

t_\text{run}

shall satisfy

t_\text{run} \gg \tau_u

. Via a Green–Kubo relation

\tau_u

is directly related to the self-diffusion coefficient

D

; in reduced units,

\tau_u^* = D^*

In dense fluids,

D^*

and thus

\tau_u^*

are smaller than the dilute-gas estimation above.

Free flight timescales

Second, we discuss the free flight times of particles, based on the article Collisional statistics of the hard-sphere gas by P. Visco, F. van Wijland, and E. Trizac (Phys. Rev. E 2008). The non-dimensional mean free flight time

\langle t_\text{f}^* \rangle

in a dilute gas is given by (needs to be checked)

For small free flight times, their probability density function (PDF)

f(t_\text{t})

is close to the Poisson one,

However, for

t_\text{f} \gg \langle t_\text{f} \rangle

the PDF deviates from the Poisson one, its tail is given by

To resolve statistics on the free flight times we require

N f_{\text{tail}}(t_\text{run}) \ll 1

for each molecule to collide several times during the simulation.

Sound propagation time

Third, we discuss the time

t_c

it takes for a sound wave to propagate through half the length of the simulation box,

L/2 = \frac{1}{2}\sqrt[3] V

. To avoid the influence of the periodic boundary conditions on the gathered time statistics, we should take

t_\text{run} < t_c.

The speed of sound

c

in a dilute hard-sphere gas is

Optimal values for

t_\text{run}

and

N

The goal is to be able to accurately simulate systems with

\rho^*

as low as

0.01

. This corresponds to

N = 3 \, 650 \, 692

. We use a fixed value of

for all systems with

\rho^* \geqslant 0.01

. This value is close to our estimate and is consistent with how DynamO assigns

N

in simulations.

DynamO simulation setup

I describe the three steps using a system at

\rho^* = 0.1

. For subsequent processing and conversion to HDF5, it is assumed that the commands below are executed in a folder named 1e-1, ensuring that the generated data is written to this directory.

Initial configuration file

First, I create the initial configuration file config.start.xml, in which particles are packed into a grid of face-centered cubic (FCC) unit cells (4 particles per cell), with the unit cells arranged in a cubic grid of side length

97

. This results a total of

N = 4 \times 97^3

particles.

The initial configuration file for

\rho^* = 0.1

and

N = 4 \times 97^3

is created with the command:

dynamod --pack-mode 0 \
        --NCells 97 \
        --density 0.1 \
        --i1 0 \
        --out-config-file config.start.xml

Equilibrating the system

Once the initial configuration file was created, we equilibrate the system. For the chosen density,

\rho^* = 0.1

, we have

t_\text{run}^* = 27.5

. We equilibrate for

4t_\text{run} = 110

. Thus, the DynamO command used for equilibration is:

dynarun --config-file config.start.xml \
        --sim-end-time 110 \
        --out-data-file output.xml \
        --out-config-file config.equilibrated.xml

Running the simulation

Now we use the equilibrated state to launch the main simulation for

t_\text{run}^* = 27.5

. To save particle trajectories we launch dynarun with the Trajectory output plugin:

dynarun --config-file config.equilibrated.xml \
        --sim-end-time 27.5 \
        --load-plugin Trajectory \
        --load-plugin MSD \
        --unwrapped \
        --out-data-file output.xml \
        --out-config-file config.end.xml

Once the simulation is finished, you can find the detailed particle trajectories in the trajectory.out file. However, the Trajectory plugin, at the time of writing, is more of a diagnostic rather than investigatory tool. For it to output the desired data, I had to modify it, as I describe below.

Outputting particle trajectories

I output trajectories data as a CSV (technically, space-separated values) file with the following columns

Column name	Description
event	collision id
t	collision time
dt	time since last collision/simulation start
p1	first particle id
p2	second particle id
x1	first particle position (x-component)
y1	first particle position (y-component)
z1	first particle position (z-component)
x2	second particle position (x-component)
y2	second particle position (y-component)
z2	second particle position (z-component)
u1	first particle velocity (x-component)
v1	first particle velocity (y-component)
w1	first particle velocity (z-component)
u2	second particle velocity (x-component)
v2	second particle velocity (y-component)
w2	second particle velocity (z-component)

To achieve this output from DynamO, I had to modify the Trajectory output plugin located in DynamO/src/dynamo/dynamo/outputplugins/trajectory.cpp. Specifically, I modified two methods, OPTrajectory::initialise and OPTrajectory::eventUpdate as follows:

void OPTrajectory::initialise() {
  if (logfile.is_open())
    logfile.close();

  logfile.open("trajectory.out", std::ios::out | std::ios::trunc);
  logfile.precision(std::numeric_limits<double>::max_digits10);
  logfile.setf(std::ios::scientific);

  logfile << "event "
          << "t "
          << "dt "
          << "p1 "
          << "p2 "
          << "x1 "
          << "y1 "
          << "z1 "
          << "x2 "
          << "y2 "
          << "z2 "
          << "u1 "
          << "v1 "
          << "w1 "
          << "u2 "
          << "v2 "
          << "w2\n";
}

void OPTrajectory::eventUpdate(const Event &eevent, const NEventData &SDat) {
  logfile << std::setw(12) << std::setfill('0') << Sim->eventCount // event
          << " " << Sim->systemTime / Sim->units.unitTime()        // t
          << " " << eevent._dt / Sim->units.unitTime();            // dt


  for (const PairEventData &pData : SDat.L2partChanges) {
    const size_t id1 = std::min(pData.particle1_.getParticleID(),
                                pData.particle2_.getParticleID());
    const size_t id2 = std::max(pData.particle1_.getParticleID(),
                                pData.particle2_.getParticleID());

    Vector r1 = Sim->particles[id1].getPosition() / Sim->units.unitLength();
    Vector r2 = Sim->particles[id2].getPosition() / Sim->units.unitLength();

    Vector v1 = Sim->particles[id1].getVelocity() / Sim->units.unitVelocity();
    Vector v2 = Sim->particles[id2].getVelocity() / Sim->units.unitVelocity();

    logfile << " " << id1    // p1
            << " " << id2    // p2
            << " " << r1[0]  // x1
            << " " << r1[1]  // y1
            << " " << r1[2]  // z1
            << " " << r2[0]  // x2
            << " " << r2[1]  // y2
            << " " << r2[2]  // z2
            << " " << v1[0]  // u1
            << " " << v1[1]  // v1
            << " " << v1[2]  // w1
            << " " << v2[0]  // u2
            << " " << v2[1]  // v2
            << " " << v2[2]; // w2
  }
  logfile << "\n";
}

Converting DynamO output to HDF5

Once the simulation is finished I convert the trajectory data to the HDF5 format structured as described preivously.

Assuming that the data was written into the folder 1e-1, the following code creates hsf_1e-1.h5 with processed data:

using HDF5
using CSV
using DataFrames
using EzXML
using Statistics
using LinearAlgebra
using StaticArrays
using ProgressMeter


function get_config_x_u(config_file, N)
  x = Vector{SVector{3, Float64}}(undef, N)
  u = Vector{SVector{3, Float64}}(undef, N)

  config_xml = EzXML.readxml(config_file)
  particles = findall("//Pt", config_xml)
  for pt in particles
    i = parse(Int64, pt["ID"]) + 1
    x_node = findfirst("P", pt)
    u_node = findfirst("V", pt)
    x[i] = SA[parse(Float64, x_node["x"]),
              parse(Float64, x_node["y"]),
              parse(Float64, x_node["z"])]
    u[i] = SA[parse(Float64, u_node["x"]),
              parse(Float64, u_node["y"]),
              parse(Float64, u_node["z"])]
  end

  return x, u
end


function get_parameters(output_file)
  output_xml = EzXML.readxml(output_file)
  ρ_node = findfirst("//Density", output_xml)
  N_node = findfirst("//ParticleCount", output_xml)
  duration_node = findfirst("//Duration", output_xml)
  size_node = findfirst("//PrimaryImageSimulationSize", output_xml)
  
  ρ = parse(Float64, ρ_node["val"])
  N = parse(Int64, N_node["val"])
  L = parse(Float64, size_node["x"])
  t_end = parse(Float64, duration_node["Time"])
  
  return ρ, N, L, t_end
end


function check_consistency(t_trajectory, β_trajectory, x_trajectory, u_trajectory)
  ################################################################################
  println("\nChecking position-velocity consistency...")
  
  x_from_u = [vcat( [x[1]], [x[1]] .+ cumsum(u[1:end-1] .* diff(t)) )
    for (t, x, u) in zip(t_trajectory, x_trajectory, u_trajectory)]
  
  Δx = 0.0
  for (x₁, x₂) in collect(zip(x_trajectory, x_from_u))
    Δx += mean(norm.(x₁ .- x₂))
  end
  println("Δx = ", Δx / length(x_trajectory), " (should be ≈ +0)")

  ################################################################################
  println("\nChecking velocity-position consistency...")
  
  u_from_x = [diff(x) ./ diff(t) for (t, x) in zip(t_trajectory, x_trajectory)]
  for u in u_from_x push!(u, u[end]) end
  
  Δu = []
  for (u₁, u₂) in zip(u_trajectory, u_from_x)
    push!(Δu, mean(norm.(u₁ .- u₂)))
  end
  println("Δu = ", mean(Δu), " (should be ≈ +0)\n")
  
end


function process_DynamO_data(folder_name)
  _, N, _, t_end = get_parameters(folder_name * "/output.xml")

  println("Reading initial and final configurations...")
  x_start, u_start = get_config_x_u(folder_name * "/config.equilibrated.xml", N)
  x_end,   u_end   = get_config_x_u(folder_name * "/config.end.xml", N) 

  println("Reading trajectory data...")
  trajectory_df  = CSV.read(folder_name * "/trajectory.out", DataFrame, delim = ' ')

  println("Processing trajectory data...")
  
  t_trajectory = [[0.0] for _ in 1:N]
  x_trajectory = [[x] for x in x_start]
  u_trajectory = [[u] for u in u_start]
  β_trajectory = [Array{Int64, 1}([]) for _ in 1:N]

  @showprogress for r in eachrow(trajectory_df)
    push!(t_trajectory[r["p1"] + 1], r["t"])
    push!(t_trajectory[r["p2"] + 1], r["t"])

    push!(β_trajectory[r["p1"] + 1], r["p2"] + 1)
    push!(β_trajectory[r["p2"] + 1], r["p1"] + 1)
    
    push!(x_trajectory[r["p1"] + 1], SA[r["x1"], r["y1"], r["z1"]])
    push!(x_trajectory[r["p2"] + 1], SA[r["x2"], r["y2"], r["z2"]])

    push!(u_trajectory[r["p1"] + 1], SA[r["u1"], r["v1"], r["w1"]])
    push!(u_trajectory[r["p2"] + 1], SA[r["u2"], r["v2"], r["w2"]])
  end

  for i in 1:N
    push!(t_trajectory[i], t_end)
    push!(x_trajectory[i], x_end[i])
    push!(u_trajectory[i], u_end[i])
  end

  check_consistency(t_trajectory, β_trajectory, x_trajectory, u_trajectory)
  
  return t_trajectory, β_trajectory, x_trajectory, u_trajectory
end


function process_and_write_DynamO_data(folder_name)
  ρ, N, L, t_end = get_parameters(folder_name * "/output.xml")
  
  t_trajectory, β_trajectory, x_trajectory, _ = process_DynamO_data(folder_name)

  file_name = folder_name * "/hsf_" * folder_name * ".h5"
  println("Writing trajectory data into " * file_name * "...")
  h5open(file_name, "w") do f
    f["t_start"] = 0.0
    f["t_end"]   = t_end
    f["ρ"] = ρ
    f["N"] = N
    f["L"] = L
    f["README"] = "This file contains particle trajectories from the simulation \
    of a hard-sphere fluid at thermal equilibrium. The simulation was performed \
    at reduced density ρ with N particles in a periodic cubic domain of side L, \
    using the DynamO particle simulator; for details, see https://sarnitsky.se/."

    i_to_str(i) = lpad(i, 9, '0')
    
    println("Wrtiting t data...")
    t_data = create_group(f, "t")
    @showprogress for (i, t) in enumerate(t_trajectory)
      t_data[i_to_str(i)] = t
    end

    println("Wrtiting x data...")
    x_data = create_group(f, "x")
    @showprogress for (i, x) in enumerate(x_trajectory)
      x_data[i_to_str(i)] = collect(reinterpret(reshape, Float64, x))
    end

    println("Wrtiting collision data...")
    β_data = create_group(f, "collision_partner")
    @showprogress for (i, β) in enumerate(β_trajectory)
      β_data[i_to_str(i)] = β
    end
    
  end
end

#@time ttr, βtr, xtr, utr = process_DynamO_data("test");

process_and_write_DynamO_data("1e-1")

How I used DynamO particle simulator