pythonnet: Implicit List conversion breaks backwards compatibility + numpy support

Environment

Computer 1:

Pythonnet version: 2.3, installed via pip I think
Python version: 2.7.12
Operating System: Ubuntu 16.04.2

Computer 2:

Pythonnet version: manual build on master from commit ce14424 (currently 11 commits behind)
Python version: 2.7.12
Operating System: Ubuntu 16.04.2

Details

Description: Python calls a .NET function that returns a List<MyType>. Python then passes the return value, unmodified, to a second .NET function that accepts a List<MyType>. Computer 1 executes this code just fine. On Computer 2, no .NET function matches the arguments because the return value of the first .NET function has been converted into a Python list.

Python code:

import clr
from MyDotNetProject import PythonInterop

x = PythonInterop.GetDoubleList()
PythonInterop.PrintDoubleList(x)

.NET code:

    public class PythonInterop
    {
        public static List<Double> GetDoubleList() {
            var result = new List<Double>();
            result.Add(1);
            result.Add(2);
            result.Add(3);
            return result;
        }

        public static void PrintDoubleList(List<Double> list) {
            Console.WriteLine("Your list has " + list.Count + " elements");
        }
    }

The Python code works fine on Computer 1. On Computer 2, the PrintDoubleList call produces TypeError: No method matches given arguments

If I print type(x) in Python, Computer 1 gives me a .NET List type while Computer 2 gives me a Python list. I can print x.Count on Computer 1, but I get a missing attribute error on Computer 2.

If I build manually from the 2.3 tag, I get the same (good) behavior as on Computer 1.

It seems that some feature has been partially added to automatically convert .NET objects into Python objects when possible. I suppose this is ok (though I would prefer that not happen because I don’t want the mandatory performance hit of converting even when I don’t want to convert), but if that’s the intention, there must be automatic conversion of Python objects to .NET objects also. One without the other is a critical bug.
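
Until conversion works in both directions, one possible workaround is to rebuild a .NET list explicitly before calling back into .NET. The sketch below is only an illustration, not pythonnet's own mechanism: to_net_list and StandInList are hypothetical names, and StandInList merely mimics the Add/Count surface of List<Double> so the snippet runs without a CLR. Under pythonnet the factory would presumably be something like lambda: List[Double](), with List imported from System.Collections.Generic.

```python
def to_net_list(values, make_list):
    """Copy a Python iterable into a new .NET-style list.

    make_list is a zero-argument factory returning an object with an
    Add(item) method. Assumption: under pythonnet this would be something
    like `lambda: List[Double]()`.
    """
    target = make_list()
    for value in values:
        target.Add(float(value))  # List<Double> expects doubles
    return target


class StandInList(object):
    """Hypothetical stand-in for List<Double>; not part of pythonnet."""

    def __init__(self):
        self.items = []

    def Add(self, item):
        self.items.append(item)

    @property
    def Count(self):
        return len(self.items)


# x is what Computer 2 hands back: a plain Python list
x = [1.0, 2.0, 3.0]
converted = to_net_list(x, StandInList)
print(converted.Count)
```

This costs a second full copy per call, which is exactly the overhead I would like to avoid, so it is a stopgap rather than a fix.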

About this issue

State: closed
Created 7 years ago
Reactions: 3
Comments: 36 (18 by maintainers)

Commits related to this issue

added RawProxyEncoder Now Python host can force raw encoding for autoconverted .NET types. Enables workaround for #514 — committed to losttech/pythonnet by lostmsu 4 years ago
Add RawProxyEncoder (#1122) Now Python host can force raw encoding for autoconverted .NET types. Enables workaround for #514 — committed to pythonnet/pythonnet by lostmsu 4 years ago
Add RawProxyEncoder (#1122) Now Python host can force raw encoding for autoconverted .NET types. Enables workaround for #514 — committed to QuantConnect/pythonnet by lostmsu 4 years ago

Most upvoted comments

Hi, I wrote Python-side NumPy <-> .NET conversion functions that use ctypes.memmove, based on what @BenjaminPelletier provided above and the 2014 mailing list conversation between David Cook and Jeffrey Bush. They don’t require any C# helper code. I think I’ve got support for all the types in System except strings. Performance seems adequate: I can do around 1 MB / ms after the initial lazy import:

Numpy to .NET converted 8388608 bytes in 4.231 +/- 1.812 ms (mean: 4.0 ns/ele)
.NET to Numpy converted 8388608 bytes in 3.461 +/- 1.200 ms (mean: 3.3 ns/ele)

The ctypes.memmove approach is as fast as Marshal.Copy and doesn’t have any of the issues with multi-dimensional arrays. Source code follows; the __main__ block is all testing code:

import numpy as np
import ctypes
import clr, System
from System import Array, Int32
from System.Runtime.InteropServices import GCHandle, GCHandleType

_MAP_NP_NET = {
    np.dtype('float32'): System.Single,
    np.dtype('float64'): System.Double,
    np.dtype('int8')   : System.SByte,
    np.dtype('int16')  : System.Int16,
    np.dtype('int32')  : System.Int32,
    np.dtype('int64')  : System.Int64,
    np.dtype('uint8')  : System.Byte,
    np.dtype('uint16') : System.UInt16,
    np.dtype('uint32') : System.UInt32,
    np.dtype('uint64') : System.UInt64,
    np.dtype('bool')   : System.Boolean,
}
_MAP_NET_NP = {
    'Single' : np.dtype('float32'),
    'Double' : np.dtype('float64'),
    'SByte'  : np.dtype('int8'),
    'Int16'  : np.dtype('int16'), 
    'Int32'  : np.dtype('int32'),
    'Int64'  : np.dtype('int64'),
    'Byte'   : np.dtype('uint8'),
    'UInt16' : np.dtype('uint16'),
    'UInt32' : np.dtype('uint32'),
    'UInt64' : np.dtype('uint64'),
    'Boolean': np.dtype('bool'),
}

def asNumpyArray(netArray):
    '''
    Given a CLR `System.Array` returns a `numpy.ndarray`.  See _MAP_NET_NP for 
    the mapping of CLR types to Numpy dtypes.
    '''
    dims = np.empty(netArray.Rank, dtype=int)
    for I in range(netArray.Rank):
        dims[I] = netArray.GetLength(I)
    netType = netArray.GetType().GetElementType().Name

    try:
        npArray = np.empty(dims, order='C', dtype=_MAP_NET_NP[netType])
    except KeyError:
        raise NotImplementedError("asNumpyArray does not yet support System type {}".format(netType) )

    # Pin the .NET array so the GC cannot move it during the memmove
    sourceHandle = GCHandle.Alloc(netArray, GCHandleType.Pinned)
    try:
        sourcePtr = sourceHandle.AddrOfPinnedObject().ToInt64()
        destPtr = npArray.__array_interface__['data'][0]
        ctypes.memmove(destPtr, sourcePtr, npArray.nbytes)
    finally:
        if sourceHandle.IsAllocated: sourceHandle.Free()
    return npArray

def asNetArray(npArray):
    '''
    Given a `numpy.ndarray` returns a CLR `System.Array`.  See _MAP_NP_NET for 
    the mapping of Numpy dtypes to CLR types.

    Note: `complex64` and `complex128` arrays are converted to `float32` 
    and `float64` arrays respectively with shape [m,n,...] -> [m,n,...,2]
    '''
    dims = list(npArray.shape)  # a list, so the complex case can append a trailing axis
    dtype = npArray.dtype
    # For complex arrays, we must make a view of the array as its corresponding 
    # float type.
    if dtype == np.complex64:
        dtype = np.dtype('float32')
        dims.append(2)
        npArray = npArray.view(np.float32).reshape(dims)
    elif dtype == np.complex128:
        dtype = np.dtype('float64')
        dims.append(2)
        npArray = npArray.view(np.float64).reshape(dims)

    netDims = Array.CreateInstance(Int32, npArray.ndim)
    for I in range(npArray.ndim):
        netDims[I] = Int32(dims[I])
    
    if not npArray.flags.c_contiguous:
        npArray = npArray.copy(order='C')
    assert npArray.flags.c_contiguous

    try:
        netArray = Array.CreateInstance(_MAP_NP_NET[dtype], netDims)
    except KeyError:
        raise NotImplementedError("asNetArray does not yet support dtype {}".format(dtype))

    # Pin the .NET array so the GC cannot move it during the memmove
    destHandle = GCHandle.Alloc(netArray, GCHandleType.Pinned)
    try:
        sourcePtr = npArray.__array_interface__['data'][0]
        destPtr = destHandle.AddrOfPinnedObject().ToInt64()
        ctypes.memmove(destPtr, sourcePtr, npArray.nbytes)
    finally:
        if destHandle.IsAllocated: destHandle.Free()
    return netArray

if __name__ == '__main__':
    from time import perf_counter
    import matplotlib.pyplot as plt
    import psutil

    tries = 1000
    foo = np.full([1024,1024], 2.5, dtype='float32')


    netMem = np.zeros(tries)
    t_asNet = np.zeros(tries)
    netFoo = asNetArray( foo ) # Lazy loading makes the first iteration very slow
    for I in range(tries):
        t0 = perf_counter()
        netFoo = asNetArray( foo )
        t_asNet[I] = perf_counter() - t0
        netMem[I] = psutil.virtual_memory().free / 2.0**20

    t_asNumpy = np.zeros(tries)
    numpyMem = np.zeros(tries)
    unNetFoo = asNumpyArray( netFoo ) # Lazy loading makes the first iteration very slow
    for I in range(tries):
        t0 = perf_counter()
        unNetFoo = asNumpyArray( netFoo )
        t_asNumpy[I] = perf_counter() - t0
        numpyMem[I] = psutil.virtual_memory().free / 2.0**20

    # Convert times to milliseconds
    t_asNet *= 1000
    t_asNumpy *= 1000
    np.testing.assert_array_almost_equal( unNetFoo, foo )
    print( "Numpy to .NET converted {} bytes in {:.3f} +/- {:.3f} ms (mean: {:.1f} ns/ele)".format( \
        foo.nbytes, t_asNet.mean(), t_asNet.std(), t_asNet.mean()/foo.size*1e6 ) )
    print( ".NET to Numpy converted {} bytes in {:.3f} +/- {:.3f} ms (mean: {:.1f} ns/ele)".format( \
        foo.nbytes, t_asNumpy.mean(), t_asNumpy.std(), t_asNumpy.mean()/foo.size*1e6 ) )

    plt.figure()
    plt.plot(np.arange(tries), netMem, '-', label='asNetArray')
    plt.plot(np.arange(tries), numpyMem, '-', label='asNumpyArray')
    plt.legend(loc='best')
    plt.ylabel('Free memory (MB)')
    plt.xlabel('Iteration')
    plt.show(block=True)

I don’t see any evidence of memory-leaking.

Edit: one can do a zero-copy with np.frombuffer but then you have a mess of memory managed both by Python’s garbage collector and C#'s garbage collector. If people here know how to deal with references in both GCs, let me know.
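
To make the trade-off concrete, here is a minimal sketch that uses only the standard library (array and ctypes stand in for the NumPy and .NET buffers, which is an assumption made so that it runs anywhere). A memmove copy is independent of its source afterwards, whereas a zero-copy view keeps reading the source's memory, so the source has to stay alive and in place for as long as the view exists.

```python
import array
import ctypes

# Source buffer; stands in for a pinned .NET array
src = array.array('d', [1.0, 2.0, 3.0, 4.0])
nbytes = len(src) * src.itemsize

# Copying path: raw byte copy into a separately owned buffer
dst = array.array('d', [0.0] * len(src))
src_addr, _ = src.buffer_info()
dst_addr, _ = dst.buffer_info()
ctypes.memmove(dst_addr, src_addr, nbytes)

# Zero-copy path: a view that shares the source's memory
view = memoryview(src)

src[0] = 99.0  # mutate the source after both were created

copy_first = dst[0]   # the copy kept the old value
view_first = view[0]  # the view sees the new value
print(copy_first, view_first)
```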

robbmcleod on Dec 13, 2017

Calling Python from .NET is outside my use case so I’m not familiar with the logistics of doing so. But “just like I’d expect” would be for pythonnet to expose explicit .NET types that map to Python types for the user to instantiate (in .NET) by some means before calling Python with them. These types would probably have helper functions for converting/wrapping from common corresponding .NET types.

FWIW, here’s how I’m marshalling arrays between .NET and NumPy, and it seems like it would have been nice to not have to figure out & write this myself:

Python:

######################## Copy operations #################################

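# [Python.NET] Efficient copy of .NET Array to ctypes or numpy array.
# https://mail.python.org/pipermail/pythondotnet/2014-May/001525.html

import clr
import numpy as np
from System import Array, Boolean, Double, Int32, Int64, IntPtr
from System.Runtime.InteropServices import Marshal
# Assumption: PythonInterop (the .NET helper class listed below) lives in the
# same assembly as in the issue description.
from MyDotNetProject import PythonInterop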

def copyNumpyToDotNetArray(srcnparray, destdotnetarray):
    ''' Copies the content in a numpy array of any dimensions with dtype=float to a one-dimensional .NET array
    The number of elements in srcnparray must not exceed the size of destdotnetarray
    
    @srcnparray: numpy array (any number of dimensions) with dtype=float
    @destdotnetarray: Pre-allocated .NET Double Array object (via Array.CreateInstance(Double, n) or similar) to copy the numpy array content into
    '''
    if len(srcnparray.shape) == 1 and not srcnparray.dtype == bool:
        ptr = IntPtr.__overloads__[long](srcnparray.__array_interface__['data'][0])
        Marshal.Copy(ptr, destdotnetarray, 0, srcnparray.size)
    else:
        if not srcnparray.flags.c_contiguous:
            srcnparray = srcnparray.copy(order='C')
        assert srcnparray.flags.c_contiguous
        PythonInterop.CopyFromPointer(IntPtr.__overloads__[long](srcnparray.__array_interface__['data'][0]), destdotnetarray)

def copyDotNetArrayToNumpy(srcdotnetarray, destnparray):
    ''' Copies the content of a one-dimensional .NET array into a numpy array of any dimensions
    The number of elements in srcdotnetarray and destnparray must match
    
    @srcdotnetarray: One-dimensional .NET Array of Doubles (via Array.CreateInstance(Double, n) or similar) containing source data
    @destnparray: numpy array (any number of dimensions) to copy the source data into
    
    Use dest = np.empty(desiredShape, dtype=float) to preinitialize if creating a new variable
    '''
    if srcdotnetarray.Rank == 1:  # Marshal.Copy only handles one-dimensional arrays
        ptr = IntPtr.__overloads__[long](destnparray.__array_interface__['data'][0])
        Marshal.Copy(srcdotnetarray, 0, ptr, srcdotnetarray.Length)
    else:
        PythonInterop.CopyToPointer(srcdotnetarray, IntPtr.__overloads__[long](destnparray.__array_interface__['data'][0]))

######################## Conversions to .NET #############################

def dotnetArrayOf(nparray):
    ''' Creates and returns a .NET Array that mirrors the provided numpy array
    
    @nparray: numpy array
    
    @Returns: .NET Array with element type matching nparray.dtype and identical dimensions with content that matches the provided numpy array
    '''
    dims = nparray.shape
    n = len(dims)
    dims_dn = Array.CreateInstance(Int32, n)
    for i in range(n):
        dims_dn[i] = Int32(dims[i])
    if nparray.dtype == float:
        dn = Array.CreateInstance(Double, dims_dn)
    elif nparray.dtype == int: #NOTE: this check is probably invalid in Python 3.x
        dn = Array.CreateInstance(Int64, dims_dn)
    elif nparray.dtype == np.int32:
        dn = Array.CreateInstance(Int32, dims_dn)
    elif nparray.dtype == np.bool:
        dn = Array.CreateInstance(Boolean, dims_dn)
    else:
        raise NotImplementedError("dotnetArrayOf does not yet support dtype=" + str(nparray.dtype))
    copyNumpyToDotNetArray(nparray, dn)
    return dn

######################## Conversions from .NET ###########################
    
def numpyArrayOf(dotnetarray):
    ''' Creates and returns a numpy Array that mirrors the provided .NET array
    
    @dotnetarray: .NET Array
    
    @Returns: numpy Array with dtype matching dotnetarray's element type, and identical dimensions with content that matches the provided .NET Array
    '''
    dims = np.empty(dotnetarray.Rank, dtype=int)
    for i in range(dotnetarray.Rank):
        dims[i] = dotnetarray.GetLength(i)
    type_dn = dotnetarray.GetType().GetElementType().Name
    if type_dn == 'Double':
        dtype = float
    elif type_dn == 'Int32':
        dtype = np.int32
    elif type_dn == 'Int64':
        dtype = int #NOTE: probably invalid in Python 3.x
    elif type_dn == 'Boolean':
        dtype = np.bool
    else:
        raise NotImplementedError("numpyArrayOf does not yet support .NET Arrays with " + str(type_dn) + " elements")
    nparray = np.empty(dims, dtype=dtype, order='C')
    copyDotNetArrayToNumpy(dotnetarray, nparray)
    return nparray

.NET:

    public class PythonInterop
    {
        /// <summary>
        /// Copies data from a NumPy array to the destination .NET array
        /// </summary>
        /// <param name="pSource">Pointer to a NumPy array to copy from</param>
        /// <param name="dest">.NET array to be copied into</param>
        /// <remarks>This routine handles Boolean arrays in a special way because Boolean elements occupy 1 byte in both NumPy and managed .NET arrays, while Marshal.SizeOf reports the 4-byte unmanaged marshaling size</remarks>
        public static void CopyFromPointer(IntPtr pSource, Array dest)
        {
            Type elementType = dest.GetType().GetElementType();
            int sizeOfElement = Marshal.SizeOf(elementType);
            if (elementType == typeof(Boolean))
                sizeOfElement = 1;

            int byteCount = sizeOfElement;
            for (int i = 0; i < dest.Rank; i++)
            {
                byteCount *= dest.GetLength(i);
            }

            var gch = GCHandle.Alloc(dest);
            var tPtr = Marshal.UnsafeAddrOfPinnedArrayElement(dest, 0);
            MemCopy(pSource, tPtr, byteCount);
            gch.Free();
        }

        /// <summary>
        /// Copies data from a .NET array to a NumPy array
        /// </summary>
        /// <param name="source">.NET array to be copied from</param>
        /// <param name="pDest">Pointer to a NumPy array to copy into</param>
        public static void CopyToPointer(Array source, IntPtr pDest)
        {
            Type elementType = source.GetType().GetElementType();
            int sizeOfElement = Marshal.SizeOf(elementType);
            if (elementType == typeof(Boolean))
                sizeOfElement = 1;

            int byteCount = sizeOfElement;
            for (int i = 0; i < source.Rank; i++)
                byteCount *= source.GetLength(i);

            var gch = GCHandle.Alloc(source);
            var tPtr = Marshal.UnsafeAddrOfPinnedArrayElement(source, 0);
            MemCopy(tPtr, pDest, byteCount);
            gch.Free();
        }

        private static readonly int SIZEOFINT = Marshal.SizeOf(typeof(int));
        private static unsafe void MemCopy(IntPtr pSource, IntPtr pDest, int byteCount)
        {
            int count = byteCount / SIZEOFINT;
            int rest = byteCount % SIZEOFINT; // bytes left over after the whole-int copies
            unchecked
            {
                int* ps = (int*)pSource.ToPointer(), pd = (int*)pDest.ToPointer();
                // Loop over the cnt in blocks of 4 bytes, 
                // copying an integer (4 bytes) at a time:
                for (int n = 0; n < count; n++)
                {
                    *pd = *ps;
                    pd++;
                    ps++;
                }
                // Complete the copy by moving any bytes that weren't moved in blocks of 4:
                if (rest > 0)
                {
                    byte* ps1 = (byte*)ps;
                    byte* pd1 = (byte*)pd;
                    for (int n = 0; n < rest; n++)
                    {
                        *pd1 = *ps1;
                        pd1++;
                        ps1++;
                    }
                }
            }
        }
    }
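
The word-then-remainder split in MemCopy can be checked outside of C#. The following is a small Python sketch of the same arithmetic (chunked_copy and WORD_SIZE are names made up for illustration): whole 4-byte words are copied first, then the leftover bytes, whose count is the byte count modulo the word size.

```python
WORD_SIZE = 4  # bytes per int, as in the C# helper


def chunked_copy(src, dst):
    """Copy src into dst word by word, then byte by byte for the tail."""
    byte_count = len(src)
    words, rest = divmod(byte_count, WORD_SIZE)

    # Whole words
    for n in range(words):
        start = n * WORD_SIZE
        dst[start:start + WORD_SIZE] = src[start:start + WORD_SIZE]

    # Trailing bytes that do not fill a whole word
    tail = words * WORD_SIZE
    for n in range(rest):
        dst[tail + n] = src[tail + n]
    return words, rest


source = bytearray(range(10))
dest = bytearray(len(source))
words, rest = chunked_copy(source, dest)
print(words, rest, dest == source)
```

Buffers shorter than one word are worth testing too, since they exercise the tail loop alone.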

BenjaminPelletier on Jul 20, 2017

Yeah, I’m more and more inclined to just deactivate it by default. A runtime switch is possible via the clr pseudo-module or an additional function on Python.Runtime.

filmor on Sep 12, 2019

I have a use-case that is also broken by the #427 change and doesn’t have a good workaround. Consider the following C# class:

namespace MyNamespace
{
    public class MyListContainerClass
    {
        private List<double> _myList = new List<double>();

        public List<double> MyList
        {
            get { return _myList; }
        }
    }
}

Once the List is converted to a python list, any mutations to the python list don’t get updated in the original C# class. You can see this with the following python code:

my_test_class = MyNamespace.MyListContainerClass()
my_list = my_test_class.MyList

# throws exception
# my_list.Add(4)

my_list.append(4)
assert (len(my_list) == 1) # passes

my_list = my_test_class.MyList
assert (len(my_list) == 1) # fails!!!!!

This breaks any case where you would have some other C# method on the class that needs to use the list after it’s been modified by python.
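
The effect can be reproduced without any .NET code. In this sketch (plain Python, hypothetical class names) a property that returns a fresh copy on every access behaves like the implicit conversion, while a property that returns the underlying object behaves like the 2.3 proxy.

```python
class ConvertingContainer(object):
    """Mimics the implicit conversion: each access returns a new copy."""

    def __init__(self):
        self._my_list = []

    @property
    def MyList(self):
        return list(self._my_list)


class ProxyContainer(object):
    """Mimics the 2.3 behavior: each access returns the same object."""

    def __init__(self):
        self._my_list = []

    @property
    def MyList(self):
        return self._my_list


converting = ConvertingContainer()
converting.MyList.append(4)    # mutates a throwaway copy
lost = len(converting.MyList)  # the container never saw the 4

proxy = ProxyContainer()
proxy.MyList.append(4)         # mutates the container's own list
kept = len(proxy.MyList)

print(lost, kept)
```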

fartsmajeure on Jun 5, 2019

@denfromufa I also have a numpy array converter here https://github.com/yagweb/pythonnetLab/blob/master/pynetLab/Numpy.cs

With this converter, the example https://github.com/pythonnet/pythonnet#example can be replaced with this one: https://github.com/yagweb/pythonnetLab/blob/master/Test/TestMatplotlib.cs It may be a better example because:

.NET users can create a numpy array with a single line, like this: var x = Numpy.NewArray(new double[]{ 1.0, 2.0, 3.0, 4.0, 5.0, 6.0 }); so converting CLR lists to Python lists by default is not needed.
Usage of PyScope and Matplotlib is also included in this example.

...
scope.Exec(
    "fig = plt.figure() \n" +
    "plt.plot(x[1:], y[1:]) \n"   //we can slice a numpy array
);
var fig = scope.Get("fig");
//fig.show(); //show the figure
plotdata = Matplotlib.SaveFigureToArray(fig, 200, "png"); //save the figure to a .NET array

yagweb on Jul 25, 2017

@ddsleonardo don’t worry, the implicit conversion is going to be off by default in the next major version.

I was not here around the time Python.NET started, but I suspect this conversion was added to make most APIs easily callable: List<T> does not implement the Python collection protocol, and Python’s list does not implement IEnumerable and other stuff.
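
One alternative to copying is a thin adapter that presents a .NET-style list through Python's sequence protocol. This is only a sketch under assumptions: NetListView and FakeNetList are hypothetical names, the wrapped object is assumed to expose Count, an indexer, Insert and RemoveAt the way List<T> does, and FakeNetList exists purely so the example runs without a CLR. Slicing is not handled.

```python
from collections.abc import MutableSequence


class NetListView(MutableSequence):
    """Adapter exposing a .NET-style list as a Python mutable sequence."""

    def __init__(self, net_list):
        self._net = net_list

    def __len__(self):
        return self._net.Count

    def _check(self, index):
        # Normalize negative indexes and reject out-of-range ones
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        return index

    def __getitem__(self, index):
        return self._net[self._check(index)]

    def __setitem__(self, index, value):
        self._net[self._check(index)] = value

    def __delitem__(self, index):
        self._net.RemoveAt(self._check(index))

    def insert(self, index, value):
        if index < 0:
            index += len(self)
        index = max(0, min(len(self), index))  # clamp like list.insert
        self._net.Insert(index, value)


class FakeNetList(object):
    """Hypothetical stand-in for List<double>; not part of pythonnet."""

    def __init__(self):
        self._items = []

    @property
    def Count(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __setitem__(self, index, value):
        self._items[index] = value

    def Insert(self, index, value):
        self._items.insert(index, value)

    def RemoveAt(self, index):
        del self._items[index]


backing = FakeNetList()
view = NetListView(backing)
view.append(4.0)            # forwarded to Insert on the backing list
view.extend([5.0, 6.0])
del view[0]                 # forwarded to RemoveAt
print(list(view), backing.Count)
```

Because every operation is forwarded, mutations made from Python stay visible to the object that owns the list, at the price of one interop call per element access.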

lostmsu on Oct 30, 2020