title: NumPy & Scientific Computing
sidebar_position: 8
id: python_numpy_integration
license: |
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0
Fory natively supports NumPy arrays with optimized serialization.
Large arrays are serialized with zero-copy when possible:

    import pyfory
    import numpy as np

    f = pyfory.Fory()

    # Numpy arrays are supported natively
    arrays = {
        'matrix': np.random.rand(1000, 1000),
        'vector': np.arange(10000),
        'bool_mask': np.random.choice([True, False], size=5000)
    }

    data = f.serialize(arrays)
    result = f.deserialize(data)

    # Zero-copy for compatible array types
    assert np.array_equal(arrays['matrix'], result['matrix'])

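The round-trip check above uses `np.array_equal`, which compares shape and values but not dtype. When verifying a serializer it can help to check the dtype as well. The sketch below uses only NumPy; `same_array` is a hypothetical helper name introduced for illustration and is not part of Fory.

```python
import numpy as np


def same_array(original: np.ndarray, restored: np.ndarray) -> bool:
    """Return True only if dtype, shape, and every element match."""
    return bool(
        original.dtype == restored.dtype
        and original.shape == restored.shape
        and np.array_equal(original, restored)
    )


ints = np.arange(5)                # integer dtype
floats = ints.astype(np.float64)   # same values, different dtype

# np.array_equal looks at values only, so it accepts this pair.
values_only = np.array_equal(ints, floats)

# The stricter helper rejects it because the dtypes differ.
strict = same_array(ints, floats)
```

In a round-trip test, `same_array(arrays['matrix'], result['matrix'])` could be used in place of the plain `np.array_equal` call if dtype preservation matters.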
Fory can serialize Pandas DataFrames efficiently:

    import pyfory
    import pandas as pd
    import numpy as np

    f = pyfory.Fory(xlang=False, ref=False, strict=False)

    df = pd.DataFrame({
        'a': np.arange(1000, dtype=np.float64),
        'b': np.arange(1000, dtype=np.int64),
        'c': ['text'] * 1000
    })

    data = f.serialize(df)
    result = f.deserialize(data)

    assert df.equals(result)

For maximum performance with large arrays, use out-of-band serialization:

    import pyfory
    import numpy as np

    f = pyfory.Fory(xlang=False, ref=False, strict=False)

    # Large array
    array = np.random.rand(10000, 1000)

    # Out-of-band for zero-copy
    buffer_objects = []
    data = f.serialize(array, buffer_callback=buffer_objects.append)
    buffers = [obj.getbuffer() for obj in buffer_objects]

    result = f.deserialize(data, buffers=buffers)
    assert np.array_equal(array, result)

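The `buffer_callback` and `buffers` parameters have the same names as those in the standard library's pickle protocol 5 (PEP 574). To illustrate what "out-of-band" means in general, the sketch below uses `pickle` instead of Fory: the serialized stream carries only metadata, and the array memory travels separately. That Fory follows exactly these semantics is an assumption here, not something this page confirms.

```python
import pickle

import numpy as np

array = np.random.rand(1000, 100)

# In-band: the array bytes are copied into the pickle stream itself.
in_band = pickle.dumps(array, protocol=5)

# Out-of-band: the stream holds only metadata; each large buffer is
# handed to the callback as a pickle.PickleBuffer instead of being copied.
buffers = []
out_of_band = pickle.dumps(array, protocol=5, buffer_callback=buffers.append)

# The receiver must supply the same buffers, in the same order.
restored = pickle.loads(out_of_band, buffers=buffers)

# The out-of-band stream is much smaller than the in-band one.
payload_shrank = len(out_of_band) < len(in_band)

# True when the restored array is a view over the original memory.
shares_memory = np.shares_memory(array, restored)
```

The practical consequence of this pattern is that the buffers can be sent over a transport that avoids copies (shared memory, scatter/gather I/O), while the small metadata stream is handled separately.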
- np.ndarray (all dtypes)
- np.matrix