VRAM Tracking
The native library can log CUDA memory allocation activity to a CSV file.
It intercepts cudaMalloc, cudaFree, cuMemAlloc, and cuMemFree.
Tracking is enabled by default when the native library is loaded.
Log File Location
Set VAJRA_VRAM_LOG to choose the output path:
VAJRA_VRAM_LOG=/tmp/my_run.csv python your_script.py
If VAJRA_VRAM_LOG is not set, the default path is:
/tmp/vajra_vram_allocs.csv
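A script that reads the log back after a run can resolve the path with the same rule. This is a minimal sketch: resolve_vram_log_path is a hypothetical helper, not part of vajra, and treating an empty VAJRA_VRAM_LOG like an unset one is an assumption about the library's behavior.

```python
import os

# Default path documented above.
DEFAULT_VRAM_LOG = "/tmp/vajra_vram_allocs.csv"


def resolve_vram_log_path(environ=None):
    """Return the path the tracker is expected to write to (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Assumption: an empty value falls back to the default, like an unset variable.
    return env.get("VAJRA_VRAM_LOG") or DEFAULT_VRAM_LOG
```

Passing a plain dict instead of os.environ keeps the helper easy to test without touching the real environment.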
The CSV columns are:
| Column | Meaning |
|---|---|
| TIMESTAMP | Unix timestamp for the allocation event. |
| API | CUDA API that was intercepted. |
| PTR | Device pointer. |
| SIZE | Allocation size in bytes. Frees use 0. |
| BACKTRACE | Native call stack frames separated by semicolons. |
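Because frees are logged with a SIZE of 0, finding memory that is still allocated means matching each free to its allocation by PTR. The sketch below replays the rows in order and keeps whatever was never freed. It rests on several assumptions that the format description does not confirm: the API column holds the function names exactly as listed earlier, SIZE is a decimal integer, PTR is written identically in the alloc and free rows, and a header row may or may not be present. live_allocations is a hypothetical helper name.

```python
import csv

COLUMNS = ["TIMESTAMP", "API", "PTR", "SIZE", "BACKTRACE"]
FREE_APIS = {"cudaFree", "cuMemFree"}  # assumed spelling in the API column


def live_allocations(lines):
    """Replay log rows in order and return {ptr: size} for allocations never freed."""
    live = {}
    for row in csv.DictReader(lines, fieldnames=COLUMNS):
        if row["TIMESTAMP"] == "TIMESTAMP":
            continue  # skip a header row if the file has one (assumption)
        if row["API"] in FREE_APIS:
            # Frees carry SIZE 0, so the pointer is the only link to the allocation.
            live.pop(row["PTR"], None)
        else:
            live[row["PTR"]] = int(row["SIZE"])
    return live
```

The function accepts any iterable of lines, so an open file handle for the log works directly, and summing the returned values gives the number of bytes still allocated at the end of the log.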
Pause and Resume Tracking
Use the static helpers on VajraStreamer:
from vajra import VajraStreamer
VajraStreamer.pause_vram_tracking()
# Allocations here are not logged.
VajraStreamer.resume_vram_tracking()
This is useful when you want to exclude known allocations from the log.
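If the code between the two calls raises, tracking stays paused for the rest of the process. One way to guard against that is a small context manager, sketched below. vram_tracking_paused is a hypothetical wrapper, not part of vajra. It takes the tracker as an argument and only assumes that object exposes the pause_vram_tracking and resume_vram_tracking helpers shown above. Whether nested pauses are reference-counted is not stated here, so the sketch does not nest.

```python
from contextlib import contextmanager


@contextmanager
def vram_tracking_paused(tracker):
    """Pause tracking for the duration of a with-block, then always resume."""
    tracker.pause_vram_tracking()
    try:
        yield
    finally:
        # Runs even if the block raises, so tracking is not left paused.
        tracker.resume_vram_tracking()
```

With the real library this would be used as: with vram_tracking_paused(VajraStreamer): followed by the allocations to exclude.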
What Could Go Wrong
The tracker hooks CUDA allocation functions at the native library level. If another library also hooks or wraps CUDA allocation APIs, logs can be incomplete or harder to interpret.
The log records allocation activity, not high-level ownership. Use the BACKTRACE column to infer whether an allocation came from model weights, temporary buffers, or other CUDA work.
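One way to turn backtraces into rough ownership is to match frame text against a few substrings and sum SIZE per label. The sketch below does that for rows already parsed into dicts keyed by column name (for example with csv.DictReader). The rule substrings load_weights and workspace are purely hypothetical placeholders. Real rules depend on the symbols that appear in your own stacks.

```python
from collections import defaultdict

# Hypothetical (substring, label) rules; replace with symbols from your own stacks.
RULES = [("load_weights", "model weights"), ("workspace", "temporary buffers")]


def bytes_by_owner(rows, rules=RULES):
    """Sum SIZE per label, using the first rule that matches any frame."""
    totals = defaultdict(int)
    for row in rows:
        frames = row["BACKTRACE"].split(";")
        label = next(
            (name for needle, name in rules if any(needle in f for f in frames)),
            "other",  # no rule matched any frame
        )
        totals[label] += int(row["SIZE"])
    return dict(totals)
```

Free rows contribute nothing because their SIZE is 0, so the totals reflect bytes requested per category rather than bytes currently live.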