Integrating Pytave and Nnet

enricobertino

Hi all,

In recent weeks I have been working on the nnet package, looking at different ways to integrate TensorFlow so that the deep learning part can be developed more effectively. The most effective way seems to be to use Pytave to call Python functions and scripts directly. Object mapping should not be a big problem, since neural networks deal mostly with low-dimensional arrays. This week I will test some cases with 4D arrays and basic networks.

I have two questions:
1) I heard that the package is to be integrated into core, removing the dependency on Boost. Is that still the case?
2) I saw that there is no documentation for calling Python from Octave (only for the other "direction"). I suppose there is some work in progress; is it possible to get some information on what will be implemented?

In any case, thank you to the maintainers; I find the package quite useful! I'll be glad to report bugs or help fix them if needed.

Best Regards,
Enrico Bertino


Re: Integrating Pytave and Nnet

Abhinav Tripathi


On Mar 21, 2017 3:48 AM, "Enrico Bertino" <[hidden email]> wrote:
>
> Hi all,
>
> In recent weeks I have been working on the nnet package, looking at different ways to integrate TensorFlow so that the deep learning part can be developed more effectively. The most effective way seems to be to use Pytave to call Python functions and scripts directly. Object mapping should not be a big problem, since neural networks deal mostly with low-dimensional arrays. This week I will test some cases with 4D arrays and basic networks.
>
> I have two questions:
> 1) I heard that the package is to be integrated into core, removing the dependency on Boost. Is that still the case?

Yes. The plan is to remove the Boost dependency and also to integrate it into core.

> 2) I saw that there is no documentation for calling Python from Octave (only for the other "direction"). I suppose there is some work in progress; is it possible to get some information on what will be implemented?
>

Well, there isn't much documentation, but I think that calling Octave from Python has not been tested for some time now. You can, however, easily call Python from Octave using 'pycall' or 'pyeval'. The newer 'py.*' syntax is also supported. Just download the Pytave source, build it, and add it to Octave's path, and those functions will be available. You can look at the tests in the source code to see how to use the functions.

> In any case, thank you to the maintainers; I find the package quite useful! I'll be glad to report bugs or help fix them if needed.
>

That would be great!

> Best Regards,
> Enrico Bertino
>

Regards,
Abhinav


Re: Integrating Pytave and Nnet

Colin Macdonald-2
On 20/03/17 11:00 PM, Abhinav Tripathi wrote:
> I think that calling octave from python is not tested for some time now.

Indeed, and I think the plan is to drop the code for calling Octave from Python:

https://bitbucket.org/mtmiller/pytave/issues/74/migrate-away-from-remaining-pytave-legacy

Help with this is definitely wanted: it would be a good component of any
Pytave-related GSoC proposal, or good pre-GSoC work.

Colin




Re: Integrating Pytave and Nnet

enricobertino
In reply to this post by Abhinav Tripathi
Hi all,

I have some questions about how namespaces are treated in Pytave and how modules are imported.

1) If I understand correctly, the global namespace is shared between pyexec and pyeval. Local namespaces can also be used, by defining Python dictionaries and passing them as arguments to the functions. For example, I can run:

 NS = pyeval ("{}");
 pyexec ("import numpy as np", NS)
 pyeval("np.sqrt(2)", NS)

and it works correctly. But what about pycall? Is the global namespace shared between pyexec and pycall? I ask because, for example, if I define a function with pyexec as

 pyexec (["import numpy as np\n" ...
           "def squareroot(x):\n" ...
           "    s = np.sqrt(x)\n" ...
           "    return s"]);

then I can call the function as

 pycall ("squareroot", 4)

but I cannot call

 pycall ("np.sqrt", 4)

getting the error: "pycall: no such Python function or callable: np.sqrt"

Is there a reason for that?
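
For readers following question 1, the Python side of these calls can be sketched in plain Python. This is only an illustration and not Pytave's actual implementation: it uses the standard `math` module in place of numpy so that it runs anywhere, and `resolve_dotted` is a hypothetical helper. It shows that after `import math as m` the namespace dictionary has a key `m` but no key `m.sqrt`, so a lookup that treats the whole string as a single name would fail, while evaluating the expression or walking the attributes succeeds. Whether pycall really performs a single-name lookup is an assumption here.

```python
# Plain-Python illustration; not Pytave code.
# A namespace is just a dictionary handed to exec/eval.
ns = {}
exec("import math as m", ns)

# Evaluating an expression resolves the attribute access for us.
root = eval("m.sqrt(4)", ns)

# A function defined in the namespace is stored under its own name.
exec("def squareroot(x):\n    return m.sqrt(x)", ns)

has_plain_name = "squareroot" in ns   # the function name is a key
has_dotted_name = "m.sqrt" in ns      # the dotted string is not a key


def resolve_dotted(name, namespace):
    """Hypothetical helper: resolve 'a.b.c' by walking attributes."""
    parts = name.split(".")
    obj = namespace[parts[0]]
    for attr in parts[1:]:
        obj = getattr(obj, attr)
    return obj


dotted_result = resolve_dotted("m.sqrt", ns)(4)
```

The `py.` prefix syntax recommended later in the thread avoids the issue altogether, because the fully qualified module name is given on the Octave side.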

2) Unfortunately, I am not able to import TensorFlow, which is the module that should be used for the project. I tried importing several other modules, both with pyexec ("import MODULE") and pyeval ("__import__('MODULE')"), and everything worked. I also tried implementing a basic nnet with another neural network module, Theano, and it worked. But when I try

 pyexec("import tensorflow")

I get two different errors. The first time:

 error: pyexec: AttributeError: 'module' object has no attribute 'argv'

and the second time:

 error: pyexec: ImportError: Traceback (most recent call last):
 File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 61, in <module>
 from tensorflow.python import pywrap_tensorflow
 ImportError: cannot import name pywrap_tensorflow

which is a weird TensorFlow error. The TensorFlow package is installed correctly, like the other modules, and I can use it normally in Python. Any idea how this problem could be solved?

Thank you very much!

Enrico

PS: I'm using Ubuntu 16.10, Octave 4.3.0+ installed from source, and Python 2.7.12+.

Re: Integrating Pytave and Nnet

Mike Miller-4
On Fri, Mar 31, 2017 at 10:03:26 -0700, enricobertino wrote:

> Hi all,
>
> I have some questions about how namespaces are treated in Pytave and how
> modules are imported.
>
> 1) If I understand correctly, the global namespace is shared between pyexec
> and pyeval. Local namespaces can also be used, defining python dictionaries
> and passing them as arguments to the functions. For example I can run:
>
>  NS = pyeval ("{}");
>  pyexec ("import numpy as np", NS)
>  pyeval("np.sqrt(2)", NS)
>
> and it works correctly. But what about pycall? The global namespace is
> shared between pyexec and pycall? Because for example if I define a function
> with pyexec as
>
>  pyexec (["import numpy as np\n" ...
>            "def squareroot(x):\n" ...
>            "    s = np.sqrt(x)\n" ...
>            "    return s"]);
>
> then I can call the function as
>
>  pycall ("squareroot", 4)
>
> but I cannot call
>
>  pycall ("np.sqrt", 4)
>
> getting the error: "pycall: no such Python function or callable: np.sqrt"
>
> Is there a reason for that?

Hi. First of all, I would suggest that this is not really the preferred
way to be using the Python interface. The goal is to be able to import
Python modules and call Python functions, passing Octave data back and
forth. The goal is not to be able to run arbitrary Python code.

The right (supported) way to call numpy sqrt is

    >> py.numpy.sqrt(2)
    ans =  1.4142
    >> py.numpy.sqrt([1 2 3 4])
    ans =
   
       1.0000   1.4142   1.7321   2.0000
   

> 2) Unfortunately, I am not able to import TensorFlow, which is the module
> that should be used for the project. I tried importing several other
> modules, both with pyexec ("import MODULE") and
> pyeval ("__import__('MODULE')"), and everything worked. I also tried
> implementing a basic nnet with another neural network module, Theano, and
> it worked. But when I try
>
>  pyexec("import tensorflow")
>
> I get two different errors, first time
>
>  error: pyexec: AttributeError: 'module' object has no attribute 'argv'
>
> and second time
>
>  error: pyexec: ImportError: Traceback (most recent call last):
>  File
> "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line
> 61, in <module>
>  from tensorflow.python import pywrap_tensorflow
>  ImportError: cannot import name pywrap_tensorflow
>
> which is a weird TensorFlow error. The TensorFlow package is installed
> correctly, like the other modules, and I can use it normally in Python. Any
> idea how this problem could be solved?
>
> Thank you very much!
>
> Enrico
>
> Ps. I'm using Ubuntu 16.10, Octave 4.3.0+ installed from source and Python
> 2.7.12+

I'd be happy for you to debug this and fix anything that you think needs
fixing in the way that the Python interface interacts with TensorFlow.
It could be something with the way the TensorFlow modules import
themselves that we are not handling yet.

If it's something like nested namespaces, then there may yet be some
work that needs to be done to handle different ways of specifying
modules and namespaces in Python. I remember running into some
difficulty with Matplotlib, for example, but I was able to work around
it and get it to work.

--
mike


Re: Integrating Pytave and Nnet

enricobertino
Mike Miller-4 wrote:
> I'd be happy for you to debug this and fix anything that you think needs
> fixing in the way that the Python interface interacts with TensorFlow.
> It could be something with the way the TensorFlow modules import
> themselves that we are not handling yet.
>
> If it's something like nested namespaces, then there may yet be some
> work that needs to be done to handle different ways of specifying
> modules and namespaces in Python. I remember running into some
> difficulty with Matplotlib, for example, but I was able to work around
> it and get it to work.
>
> --
> mike

Thank you for your answer. While checking whether nested namespaces were involved, I discovered that if I import pandas before TensorFlow, simply by doing

 py.__import__("pandas")
 py.__import__("tensorflow")

it works! This is very strange, because TensorFlow does not normally require pandas to be imported and should work independently. By any chance, do you have an idea what the problem could be? In which part of Pytave should I start looking?

Thank you again,
Enrico

Re: Integrating Pytave and Nnet

Mike Miller-4
On Mon, Apr 03, 2017 at 09:15:21 -0700, enricobertino wrote:
> Thank you for your answer. Analyzing if there are nested namespaces, I
> discovered that if I import Pandas before Tensorflow doing simply
>
>  py.__import__("pandas")
>  py.__import__("tensorflow")
>
> it works! It is very weird because normally Tensorflow does not require the
> import of Pandas and it should work independently.

This does not work for me. I just installed numpy, pandas, and
tensorflow in a new virtualenv, and I still get the original error.

> By any chance, do you
> have an idea about what could be the problem? In which part of Pytave should
> I start to look?

I looked a little bit into tensorflow and discovered that they do have
some tricky things with modules importing the contents of other modules,
plus there is a nested swig library involved.

However, I took a wild guess that the error about `argv` was referring
to `sys.argv`. Sure enough, if I insert a new property into the `sys`
module, I can access tensorflow without the error:

    >> py.sys.__dict__.__setitem__ ("argv", {""});
    >> py.tensorflow.VERSION
    ans = [Python object of type str]
   
      1.0.1

I have a few questions:

  * why does `sys.argv` exist when the Python interpreter is run directly,
    but not when it is embedded in another program?
  * are we not doing some important initialization step?
  * should tensorflow not be accessing sys.argv? Is this a bug?

Care to help me look into one or more of these?

--
mike


Re: Integrating Pytave and Nnet

enricobertino
Mike Miller-4 wrote:
> However, I took a wild guess that the error about `argv` was referring
> to `sys.argv`. Sure enough, if I insert a new property into the `sys`
> module, I can access tensorflow without the error:
>
>     >> py.sys.__dict__.__setitem__ ("argv", {""});
>     >> py.tensorflow.VERSION
>     ans = [Python object of type str]
>
>       1.0.1
>
> I have a few questions:
>
>   * why does `sys.argv` exist when the Python interpreter is run directly,
>     but not when it is embedded in another program?
>   * are we not doing some important initialization step?
>   * should tensorflow not be accessing sys.argv? Is this a bug?

Yes, argv may well be the cause of the issues! After manually setting the argv attribute in sys, I can import tensorflow. I guess that when I import pandas, that attribute gets initialized somehow. As you say, some initialization steps are probably missing, because tensorflow should be able to access sys.argv like any other package.
In the next few days I will try to analyze module importing in both Pytave and TensorFlow, and I will write here again!

Thank you for your help!

Cheers,
Enrico

Re: Integrating Pytave and Nnet

enricobertino
In reply to this post by Mike Miller-4
Mike Miller-4 wrote:
> I looked a little bit into tensorflow and discovered that they do have
> some tricky things with modules importing the contents of other modules,
> plus there is a nested swig library involved.

In the latest release of TensorFlow (1.0.1), the import of the swig-generated library "_pywrap_tensorflow" was not stable. There was a fix one month ago in [1], and now, using the dev version (1.1.0-rc1), I no longer have any import problems. I will try on a different machine too.

Meanwhile, I am running some tests to understand the Pytave interface. I have some questions about pyobject, but I will ask once I have a better idea of how it works.

Cheers,
Enrico

[1] https://github.com/tensorflow/tensorflow/commit/718812c9e4df55b8b3275aa4db7bb6833ed03111

Re: Integrating Pytave and Nnet

enricobertino
I wrote an image classification example that calls TensorFlow via Pytave. Everything works correctly on my machine; I think it is a good start :) I cloned the existing nnet repo and added the m-files in tests/script [1]. I implemented the standard MNIST example, both with automatic download of the dataset (mnistTensorflow.m) and with a manual import of a .mat dataset from a local directory (mnist_2k2k.m). I wrote the latter as a script and the former as a function, purely to get familiar with the tests and with make check. They call a Python class in /tests/src, which uses TensorFlow functions.

Regarding Pytave, I had some problems with the conversion of pyobjects. In particular, I noticed that there is no conversion of float32 and strings from Python to Octave. Is this correct, or am I doing something wrong?

Cheers,
Enrico

[1] https://bitbucket.org/cittiberto/nnet-enrico 

Re: Integrating Pytave and Nnet

Colin Macdonald-2
On 19/04/17 11:31 AM, enricobertino wrote:
> Regarding pytave, I had some problems with the conversion of the pyobjects.
> In particular I noticed that there is not a conversion of float32 and
> strings from python to octave. Is this correct or am I doing something
> wrong?

Not converting strings is intentional [1], but they can be cast in a reasonable way:

 >> char (py.str("hi"))

[1]
https://bitbucket.org/mtmiller/pytave/issues/65/do-not-automatically-convert-bytes-str


Re: single precision, it looks like scalars are treated differently than
numpy arrays, which seems like a bug to me:


octave:23> a = py.numpy.float32(py.numpy.random.rand(int32(3),int32(3)))
a =

    0.430638   0.020321   0.589018
    0.604080   0.591453   0.019476
    0.423140   0.134504   0.617556

octave:24> b = py.numpy.float32(py.numpy.pi)
b = [Python object of type numpy.float32]

   3.14159

octave:25> whos
Variables in the current scope:

    Attr Name        Size                     Bytes  Class
    ==== ====        ====                     =====  =====
         a           3x3                         36  single
         b           1x1                          0  pyobject

Total is 10 elements using 36 bytes


Do you want to file a pytave bug for this?



Note we can always cast like this:

 >> single (py.numpy.float32(py.numpy.pi))

But I think this will first create a double intermediate, so there is room
for improvement there as well.
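
On the double intermediate mentioned above: widening a single-precision value to double and narrowing it back is exact, so such an intermediate should cost time and memory rather than accuracy. A minimal sketch in plain Python using the standard `struct` module (the helper name `to_float32_bytes` is made up for this illustration, and nothing here is Pytave code):

```python
import struct


def to_float32_bytes(value):
    """Pack a Python float (a C double) into 4 bytes of IEEE single precision."""
    return struct.pack("<f", value)


# Start from a value rounded to single precision.
original = to_float32_bytes(3.141592653589793)

# Widen to a Python float (double), then narrow back to single precision.
as_double = struct.unpack("<f", original)[0]
round_tripped = to_float32_bytes(as_double)

# The bit patterns agree, so no precision was lost along the way.
same_bits = (original == round_tripped)
```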


Re: Integrating Pytave and Nnet

Mike Miller-4
In reply to this post by enricobertino
On Wed, Apr 19, 2017 at 11:31:53 -0700, enricobertino wrote:
> Regarding pytave, I had some problems with the conversion of the pyobjects.
> In particular I noticed that there is not a conversion of float32 and
> strings from python to octave. Is this correct or am I doing something
> wrong?

This is correct. There's not a lot of documentation yet on implicit vs
explicit conversions, but the intent is to mimic Matlab's conventions.

The following Python types are converted to Octave types implicitly:

  bool    -> logical
  int     -> int64 (Python 2 only)
  float   -> double
  complex -> complex double

All others remain as Python types and must be explicitly converted.

To explicitly convert a py.str to Octave string, use char(s). To
explicitly convert a py.numpy.float32 to Octave numeric value, use
double(x) or single(x).

The reason a numpy float64 converts automatically is because it is
derived from Python's builtin float type.
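
The rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not Pytave's code: `implicit_octave_class` is a hypothetical function, and `Float64Like` and `Float32Like` are stand-ins for numpy.float64 (a subclass of Python's float) and numpy.float32 (not a subclass of float), so that the sketch runs without numpy installed.

```python
class Float64Like(float):
    """Stand-in for numpy.float64, which derives from Python's float."""


class Float32Like(object):
    """Stand-in for numpy.float32, which does not derive from float."""

    def __init__(self, value):
        self.value = value


def implicit_octave_class(obj):
    """Return the Octave class named in the table above, or None if the
    object would stay a Python object (hypothetical helper)."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(obj, bool):
        return "logical"
    if isinstance(obj, int):
        return "int64"  # listed in the thread for Python 2 only
    if isinstance(obj, float):
        return "double"
    if isinstance(obj, complex):
        return "complex double"
    return None  # needs an explicit char(), double(), single(), ...
```

With these definitions, a `Float64Like` value is classified as "double" through the `isinstance` check on float, while a `Float32Like` value and a str both fall through to `None`, mirroring the need for an explicit conversion.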

I have a couple additional notes on your scripts, if you don't mind.

You don't need to import modules, just call the fully qualified names
and the interface takes care of importing the module automatically.

For example, just use "py.sys.path.insert(int32(0), "/some/dir")"
instead of "sys = py.__import__(...".

The same goes for any user-defined modules, "py.mnist_class.MNIST()"
works just fine.

Try to avoid using pycall or pyeval directly, since they are not Matlab
compatible functions. You should be able to call Python functions and
classes directly with just the "py." prefix.

For example, just use "py.dict(pyargs(..." instead of using pycall.

You can also use the "list of pairs" dict constructor, purely a matter
of choice. These both construct the same dict:

    py.dict (pyargs ("images", images, "labels", labels))

    py.dict ({{"images", images}, {"labels", labels}})
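
On the Python side, these two calls roughly correspond to the keyword form and the sequence-of-pairs form of the built-in dict constructor. A small plain-Python check, with short made-up lists standing in for the real images and labels arrays:

```python
# Placeholder data; the real script passes MNIST arrays.
images = [[0.0, 0.5], [1.0, 0.25]]
labels = [7, 3]

# Keyword form, the counterpart of pyargs ("images", images, ...).
from_keywords = dict(images=images, labels=labels)

# Sequence-of-pairs form, the counterpart of {{"images", images}, ...}.
from_pairs = dict([("images", images), ("labels", labels)])

# Both constructors build the same mapping.
same = (from_keywords == from_pairs)
```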

--
mike


Re: Integrating Pytave and Nnet

Enrico Bertino
In reply to this post by enricobertino
Thanks a lot for your answers!

<Mike Miller-4> wrote:
> Try to avoid using pycall or pyeval directly, since they are not Matlab
> compatible functions. You should be able to call Python functions and
> classes directly with just the "py." prefix.

I had not thought about Matlab compatibility; now I get the point of py. :) I have modified the code!

<Colin Macdonald-2> wrote:
> Re: single precision, it looks like scalars are treated differently than
> numpy arrays, which seems like a bug to me:

That's interesting, I'll take a look!

Thank you again for your time,
Enrico

Re: Integrating Pytave and Nnet

Francesco Faccio
In reply to this post by enricobertino
2017-04-16 11:15 GMT+02:00 enricobertino <[hidden email]>:
> In the last release of Tensorflow (1.0.1), the import of the swig-generated
> library "_pywrap_tensorflow" was not stable. There was a fix one month ago
> in [1], and now using the dev version (1.1.0-rc1), I have no import problems
> anymore. I will try in a different machine too.
>
> [1]
> https://github.com/tensorflow/tensorflow/commit/718812c9e4df55b8b3275aa4db7bb6833ed03111


Hi Enrico,

I tried to import Tensorflow (1.1.0-rc2) through Pytave and I got this error (I'm using Ubuntu 16.04.2 LTS):

>> py.tensorflow.VERSION
error: pycall: AttributeError: 'module' object has no attribute 'argv'
error: called from
    subsref at line 52 column 7

so it seems that there is still some work to be done to fix it. Can you please test this with TensorFlow 1.1.0-rc2? Have you already tested it on a different machine?
I suggest you take a closer look at sys.argv and the issues pointed out by Mike.

I confirm that this fixes the problem for me:
>> py.sys.__dict__.__setitem__ ("argv", {""});
>> py.tensorflow.VERSION

but if I try to import TensorFlow before setting argv, then even with this workaround I get an error message:

>> py.tensorflow.VERSION
error: pycall: AttributeError: 'module' object has no attribute 'argv'

error: called from
    subsref at line 52 column 7
>> py.sys.__dict__.__setitem__ ("argv", {""});
>> py.tensorflow.VERSION
error: pycall: ImportError: cannot import name pywrap_tensorflow

error: called from
    subsref at line 52 column 7



Thank you for the test case you wrote. I'm interested in mnist_2k2k since it imports data from Octave.
I tested your code and everything seems to work well, but I think you are making some unnecessary copies of the dataset. Can you please try to keep only one copy of the dataset once you have imported it into Octave?

Francesco

Re: Integrating Pytave and Nnet

Mike Miller-4
On Thu, Apr 27, 2017 at 01:02:28 +0200, Francesco Faccio wrote:

> I tried to import Tensorflow (1.1.0-rc2) through Pytave and I got this
> error (I'm using Ubuntu 16.04.2 LTS):
>
> >> py.tensorflow.VERSION
> error: pycall: AttributeError: 'module' object has no attribute 'argv'
> error: called from
>     subsref at line 52 column 7
>
> so it seems that there is still some work to be done in order to fix it.
> Can you please try to test this with Tensorflow 1.1.0-rc2? Have you already
> tested it in a different machine?
> I suggest you to take a closer look into sys.argv, and the issues pointed
> out by Mike.

I haven't looked into it too much, but it might be safest to just ensure
that sys.argv is set to something when the Python interpreter is loaded.
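
A minimal sketch of that idea in Python, assuming the check is run once right after the embedded interpreter starts. The function name `ensure_sys_argv` is hypothetical, and the thread does not say how Pytave ends up handling this; the default of a single empty string mirrors the workaround used earlier in the thread.

```python
import sys


def ensure_sys_argv():
    """Give sys.argv a harmless default if the host program never set it."""
    if not hasattr(sys, "argv"):
        sys.argv = [""]
    return sys.argv
```

Because the guard only acts when the attribute is missing, it leaves a normal command-line Python session untouched.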

--
mike


Re: Integrating Pytave and Nnet

Mike Miller-4
On Thu, Apr 27, 2017 at 07:41:53 -0700, Mike Miller wrote:

> On Thu, Apr 27, 2017 at 01:02:28 +0200, Francesco Faccio wrote:
> > I tried to import Tensorflow (1.1.0-rc2) through Pytave and I got this
> > error (I'm using Ubuntu 16.04.2 LTS):
> >
> > >> py.tensorflow.VERSION
> > error: pycall: AttributeError: 'module' object has no attribute 'argv'
> > error: called from
> >     subsref at line 52 column 7
> >
> > so it seems that there is still some work to be done in order to fix it.
> > Can you please try to test this with Tensorflow 1.1.0-rc2? Have you already
> > tested it in a different machine?
> > I suggest you to take a closer look into sys.argv, and the issues pointed
> > out by Mike.
>
> I haven't looked into it too much, but it might be safest to just ensure
> that sys.argv is set to something when the Python interpreter is loaded.

https://bitbucket.org/mtmiller/pytave/issues/83

--
mike