What is a .bin file?
A .bin file is a binary file: it stores raw bytes rather than human-readable text.
Unlike a text file, it cannot be read meaningfully in a text editor; its content must be interpreted by a specific program or according to a known structure.
In other words:
.bin is just a generic extension — it could store anything: numbers, images, sensor readings, or even executable code.
So, before reading a .bin file, you need to know how the data was written (for example: float32, int16, array shape, byte order, etc.).
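To see why the layout matters, here is a minimal sketch that interprets the same eight bytes in three different ways with NumPy. The sample values are made up for illustration; only the first interpretation recovers the original numbers.

```python
import numpy as np

# Two float32 values written in little-endian order (8 bytes total).
raw = np.array([1.0, 2.5], dtype='<f4').tobytes()

as_float32 = np.frombuffer(raw, dtype='<f4')     # correct type and byte order
as_int16 = np.frombuffer(raw, dtype='<i2')       # wrong type: four unrelated integers
as_float32_be = np.frombuffer(raw, dtype='>f4')  # wrong byte order: meaningless floats

print(as_float32)
print(as_int16)
print(as_float32_be)
```

None of these calls raises an error, which is why a wrong guess about the type or byte order can go unnoticed until the values are inspected.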
Binary .bin files are often accompanied by metadata files such as .hdr or .ctl.
You should always read the corresponding header file first—for example, to extract important information like the byte_order.
Depending on the data type, there are a few main options.
Using NumPy for structured numerical data
If your .bin file contains a sequence of numbers (like sensor data, images, matrices, etc.):
```python
import numpy as np

# Example: read binary file of 32-bit floats
data = np.fromfile('data.bin', dtype=np.float32)
print(data[:10])  # show first 10 values
```
You can reshape it if you know the intended array shape:
```python
data = data.reshape((100, 200))  # e.g., 100x200 image
```
If you’re not sure about the type (float32, int16, etc.), try checking the file’s metadata or documentation.
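When documentation is missing, the file size offers a quick sanity check: it should equal the number of elements multiplied by the item size of the assumed type. The sketch below illustrates the idea; `element_count` is a hypothetical helper, and the demo file is created only for illustration.

```python
import os
import tempfile
import numpy as np

def element_count(path, dtype):
    """Return how many items of `dtype` fit in the file, or None if the size is not a whole multiple."""
    size = os.path.getsize(path)
    itemsize = np.dtype(dtype).itemsize
    if size % itemsize != 0:
        return None
    return size // itemsize

# Demo: a temporary file holding a 100x200 float32 array.
tmp_path = os.path.join(tempfile.mkdtemp(), 'data.bin')
np.zeros((100, 200), dtype=np.float32).tofile(tmp_path)

n_float32 = element_count(tmp_path, np.float32)  # matches 100 * 200
n_float64 = element_count(tmp_path, np.float64)  # half as many: wrong guess for this shape
print(n_float32, n_float64)
```

A matching count does not prove the guess is right (int32 and float32 have the same size), but a mismatch rules a candidate out before any reshape fails.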
Using Python’s built-in open()
For general binary files (unknown or mixed structure):
```python
with open('file.bin', 'rb') as f:  # 'rb' means "read binary"
    content = f.read()
```
Now content is a bytes object. You can then interpret it with the struct module:
```python
import struct

# Suppose each record is one 32-bit float (4 bytes)
floats = struct.iter_unpack('f', content)
for value, in floats:
    print(value)
```
Note: Be careful — if the dataset is very large, printing it directly may cause your program to slow down or crash.
The struct format codes ('f', 'i', 'd', etc.) define how to interpret bytes.
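Format codes can be combined to describe a whole record, and a leading `<` or `>` fixes the byte order. The following sketch assumes a hypothetical record layout (an int32 id, an int16 flag, and a float64 value) and builds the bytes in memory rather than reading a real file.

```python
import struct

# Hypothetical record layout: int32 id, int16 flag, float64 value,
# little-endian ('<') with no padding between fields.
record_format = '<ihd'
record_size = struct.calcsize(record_format)  # 4 + 2 + 8 bytes

# Build two records in memory instead of reading a real file.
content = struct.pack(record_format, 1, 7, 0.5) + struct.pack(record_format, 2, -3, 1.25)

records = list(struct.iter_unpack(record_format, content))
for rec_id, flag, value in records:
    print(rec_id, flag, value)
```

Without an explicit prefix, `struct` uses the machine's native byte order and alignment, so the same format string may describe a different byte layout on another platform.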
Specialized readers
Some .bin files correspond to known scientific formats:
Satellite data (MODIS, GOES, etc.) often comes as .bin products with accompanying metadata files (.hdr, .ctl, etc.) and can be read with the following tools, provided they recognize the format:
- rasterio
- xarray + cfgrib
- GDAL
Example:
```python
import rasterio

with rasterio.open('image.bin') as src:
    arr = src.read(1)
```
If you see an accompanying file like .hdr or .xml, it usually describes the data structure (rows, cols, data type).
Summary
| Case | File Type | How to Read | Library |
|---|---|---|---|
| Raw numbers (float, int) | Sequential binary | np.fromfile() | NumPy |
| Mixed / custom | Unknown | open(..., 'rb') + struct | Python built-in |
| Geospatial / remote sensing | With metadata (.hdr, .xml) | rasterio, xarray, gdal | GDAL stack |
Real Case Example
As a practical example, consider data from the NOAA RAVE product.
Reference documentation is available here.
One of the datasets used in RAVE—land cover and ecoregion maps—can be downloaded from South Dakota State University. These datasets provide detailed spatial classifications of surface types and ecoregions across North America.
For additional background and classification details, you can consult the Ecoregions of North America dataset provided by the U.S. Environmental Protection Agency (EPA), which defines ecological regions based on climate, vegetation, soil, and land use characteristics.
The presence of a .hdr file usually means the .bin file is in an ENVI raster format (common in remote sensing), or some other binary raster format. The .hdr file contains the metadata describing the binary data: number of rows/columns, data type, byte order, interleave type, etc.
Check the .hdr file
It’s a plain text file. You can open it to see the metadata:
```python
with open("ecoregion_ecosystem_pt03degree_NorthAmerica_G2.hdr", "r") as f:
    print(f.read())
```
You’ll typically see something like:
```
ENVI
description = {
File Imported into ENVI.}
samples = 6240
lines = 2610
bands = 2
header offset = 0
file type = ENVI Standard
data type = 2
interleave = bip
sensor type = Unknown
byte order = 0
wavelength units = Unknown
map info = {Geographic Lat/Lon, 1, 1, 144.975, 81.785, 0.03, 0.03,WGS-84}
coordinate system string = {GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["Degree",0.017453292519943295]]}
```
Key things to note:
- samples → number of columns
- lines → number of rows
- bands → number of layers (1 for grayscale, >1 for multispectral)
- data type → numeric code (ENVI codes, e.g., 4 = 32-bit float)
- interleave → how data is stored (bsq, bil, bip)
- byte order → 0 = little endian, 1 = big endian
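Rather than copying these values by hand, the header can be parsed into a dictionary. The sketch below is a simplified parser, not a full implementation of the ENVI header format; `parse_envi_header` is a hypothetical helper, and the sample text is a shortened version of the header shown above.

```python
def parse_envi_header(text):
    """Parse ENVI header text into a dict of string values (simplified sketch)."""
    fields = {}
    lines = iter(text.splitlines())
    for line in lines:
        if '=' not in line:
            continue  # skips the leading "ENVI" marker and blank lines
        key, value = line.split('=', 1)
        key, value = key.strip().lower(), value.strip()
        # Values in braces may span several lines; join until the closing brace.
        while value.startswith('{') and '}' not in value:
            value += ' ' + next(lines, '}').strip()
        fields[key] = value
    return fields

sample = """ENVI
description = {
File Imported into ENVI.}
samples = 6240
lines = 2610
bands = 2
data type = 2
interleave = bip
byte order = 0
"""

header = parse_envi_header(sample)
n_cols = int(header['samples'])
n_rows = int(header['lines'])
n_bands = int(header['bands'])
print(n_rows, n_cols, n_bands, header['interleave'])
```

All values are kept as strings, so numeric fields need an explicit `int()` conversion before use.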
Read the .bin file using NumPy
You can map the ENVI data type to NumPy dtype:
| ENVI code | NumPy dtype |
|---|---|
| 1 | np.uint8 |
| 2 | np.int16 |
| 3 | np.int32 |
| 4 | np.float32 |
| 5 | np.float64 |
| 12 | np.uint16 |
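The table can be expressed as a small lookup that also applies the byte order from the header. This is a sketch covering only the codes listed above; `ENVI_TO_NUMPY` and `envi_dtype` are hypothetical names.

```python
import numpy as np

# ENVI data type code -> NumPy type string (only the codes listed in the table above).
ENVI_TO_NUMPY = {1: 'u1', 2: 'i2', 3: 'i4', 4: 'f4', 5: 'f8', 12: 'u2'}

def envi_dtype(data_type, byte_order):
    """Build a NumPy dtype from the header's 'data type' and 'byte order' values."""
    prefix = '<' if int(byte_order) == 0 else '>'
    return np.dtype(prefix + ENVI_TO_NUMPY[int(data_type)])

dt = envi_dtype(2, 0)  # 16-bit signed integer, little endian
print(dt, dt.itemsize)
```

An unlisted code raises a `KeyError`, which is preferable to silently reading the file with the wrong type.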
Example:
```python
import numpy as np

# Parameters from .hdr
n_rows = 2610
n_cols = 6240
n_bands = 2
dtype = np.int16   # ENVI data type 2
byte_order = '<'   # 0 = little endian

# Read the binary file
data = np.fromfile('ecoregion_ecosystem_pt03degree_NorthAmerica_G2.bin', dtype=byte_order + 'i2')

# Reshape according to BIP interleave
data = data.reshape((n_rows, n_cols, n_bands))

print("Data shape:", data.shape)  # should be (2610, 6240, 2)
print("Data type:", data.dtype)

# Example: access each band
image1 = data[:, :, 0]
image2 = data[:, :, 1]
```
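The reshape above is specific to BIP. For files that use another interleave, the flat array has to be reshaped differently and then transposed. The following sketch assumes the usual meaning of the three layouts; `to_rows_cols_bands` is a hypothetical helper, checked here against a tiny synthetic cube.

```python
import numpy as np

def to_rows_cols_bands(flat, n_rows, n_cols, n_bands, interleave):
    """Reshape a flat array to (rows, cols, bands) for bsq, bil, or bip."""
    interleave = interleave.lower()
    if interleave == 'bip':   # band varies fastest: pixel by pixel
        return flat.reshape((n_rows, n_cols, n_bands))
    if interleave == 'bil':   # each row stores all its bands, one after another
        return flat.reshape((n_rows, n_bands, n_cols)).transpose(0, 2, 1)
    if interleave == 'bsq':   # one full image per band, stored back to back
        return flat.reshape((n_bands, n_rows, n_cols)).transpose(1, 2, 0)
    raise ValueError(f"unknown interleave: {interleave!r}")

# Tiny synthetic cube (2 rows, 3 cols, 2 bands) to check that all layouts agree.
cube = np.arange(12, dtype='<i2').reshape((2, 3, 2))
from_bip = to_rows_cols_bands(cube.ravel(), 2, 3, 2, 'bip')
from_bil = to_rows_cols_bands(cube.transpose(0, 2, 1).ravel(), 2, 3, 2, 'bil')
from_bsq = to_rows_cols_bands(cube.transpose(2, 0, 1).ravel(), 2, 3, 2, 'bsq')
print(np.array_equal(from_bip, cube), np.array_equal(from_bil, cube), np.array_equal(from_bsq, cube))
```

Returning the same (rows, cols, bands) shape for every layout means downstream code such as `data[:, :, 0]` works regardless of how the file was stored.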
Plot the data using matplotlib:
```python
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable

# Example visualization
fig, ax = plt.subplots(figsize=(8, 6))
im = ax.imshow(image1, cmap='gray')

# Add a colorbar that matches the height of the image
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
plt.colorbar(im, cax=cax)

ax.set_title("Image1")
plt.tight_layout()
plt.savefig("read_bin_file_01.png", dpi=100, bbox_inches='tight')
plt.show()
```

Optional: Use rasterio for convenience
If the .hdr is recognized, you can also use rasterio:
```python
import rasterio

with rasterio.open('file.bin') as src:
    img = src.read(1)  # read first band
    print(img.shape)
```
Note: rasterio relies on GDAL's ENVI driver, which looks for a .hdr file with the same base name as the data file. If the file is still not recognized, reading it manually with NumPy is the safest fallback.
References
| Link | Description |
|---|---|
| https://docs.python.org/3/library/struct.html | Official Python documentation for reading binary data using the struct module |
| https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files | Python tutorial on reading and writing files (including binary mode) |
| https://numpy.org/doc/stable/reference/generated/numpy.fromfile.html | NumPy documentation for reading binary data directly into arrays |
