GSoC 2020 Idea Discussion


Atharva Dubey
Respected Sir,
mlpack is open-source software: its BSD license permits redistribution of the mlpack source and binaries, with or without modification. Nvidia's cuDNN library is an SDK, and according to its licensing page (https://docs.nvidia.com/deeplearning/sdk/cudnn-sla/index.html#general) it can be used without violating the licensing terms.

Also, CUDA binaries and header files need not be distributed from our side. The user would need to have CUDA and cuDNN installed on their system beforehand; this is also the case with popular libraries like TensorFlow and PyTorch.

Also, this would not be a workaround or hack, because when a person installs the toolbox, they would be getting the runtime files and precompiled binaries. For example, suppose we have a function to train a model, say trainModel(Param1, Param2, Param3, Cuda). When this function is called from the Octave interface, Octave would call a C++ file that does the actual work (this is what I meant by a C++ backend). If the Cuda parameter is true, the model would be loaded onto the GPU to speed up training; this is where the CUDA libraries and the CUDA SDK come into the picture. Since we would not be distributing the library or any of its files in any form, licensing should not be an issue. When Cuda is true, control would be passed to the CUDA libraries already installed on the user's system.

As mentioned, parallelization libraries are architecture-specific. Using Nvidia GPUs for DL/ML tasks is the industry norm, and every major library or software package (like MATLAB) supports it. Therefore I thought adding CUDA support to Octave would be worthwhile.

Please share your thoughts on it. 
Thanks and Regards 
Atharva Dubey
Re: GSoC 2020 Idea Discussion

John W. Eaton
Administrator
On 2/25/20 9:53 AM, Atharva Dubey wrote:

> Therefore I thought giving
> Cuda support to octave would be nice.

It would be better to have free software that allows us to work with
this class of specialized hardware.

> Please share your thoughts on it.

I recommend that you begin by reading the "Combining work with code
released under the GNU licenses" section of the GPL FAQ.

https://www.gnu.org/licenses/gpl-faq.en.html

jwe

Re: GSoC 2020 Idea Discussion

Matthias W. Klein
On 25.02.20 18:12, John W. Eaton wrote:

> On 2/25/20 9:53 AM, Atharva Dubey wrote:
>
>> Therefore I thought giving Cuda support to octave would be nice.
>
> It would be better to have free software that allows us to work with
> this class of specialized hardware.
>
>> Please share your thoughts on it.
>
> I recommend that you begin by reading the "Combining work with code
> released under the GNU licenses" section of the GPL FAQ.
>
> https://www.gnu.org/licenses/gpl-faq.en.html
>
> jwe
>
I am also not a lawyer and cannot discuss licensing.

Concerning the use of accelerating hardware, an existing free standard is OpenCL [1].  As of 2020, there are various free and open-source implementations of the OpenCL standard [2]. Besides these, there are many proprietary ones, including Nvidia/CUDA implementations.

As of the end of last January, the ocl package [3] is part of Octave Forge.
It provides (limited) support for OpenCL in Octave.  The user can choose which of the above-mentioned implementations to (manually install and) use with ocl.

I have been using the ocl package also with Nvidia GPU hardware.

@Atharva: This post only concerns the hardware backend.  However, you might want to consider whether your ML/DL development in Octave could be based on this existing (but rather new) Octave package.

Best,
Matt

[1] https://www.khronos.org/opencl/
[2] https://en.wikipedia.org/wiki/OpenCL#Open_Source_implementations
[3] https://octave.sourceforge.io/ocl/index.html


Re: GSoC 2020 Idea Discussion

Atharva Dubey
Respected Sir,
I will drop the idea of providing hardware-acceleration support for now, focus first on giving Octave an ML/DL package, and then move on to providing GPU-based acceleration.
Thanks and Regards
Atharva Dubey
Re: GSoC 2020 Idea Discussion

Benson Muite
In reply to this post by Atharva Dubey
mlpack seems good. Perhaps check what they have planned for it. OpenCL and SYCL (https://www.khronos.org/sycl/) are possible languages that offer portability across accelerators. You may wish to check whether the mlpack developers have an interest in using one of these, since it may be easier to add the acceleration directly in mlpack. Perhaps also obtain an estimate of how long it would take to integrate mlpack into Octave. If one can also export models in ONNX (https://onnx.ai/) format, this would allow using other frameworks when needed.


Re: GSoC 2020 Idea Discussion

Atharva Dubey
Respected Sir,
Supporting ONNX would be nice, as it provides a lot of flexibility. I am not that familiar with ONNX, since I have not used it before, but I will start reading about it right away.
For integrating mlpack into Octave, I thought of dividing the project into three sections: first, integrate all the machine learning functions with Octave and make a package out of them; then move on to the deep learning modules, followed by ONNX support. I plan to have the machine learning modules integrated by the first evaluation and the deep learning ones by the second, and then move on to ONNX, i.e. a month for each part of the project.
Thanks and regards
Atharva Dubey

Re: GSoC 2020 Idea Discussion

Benson Muite
Hi Atharva,
One other project that may be interesting is TVM (https://tvm.apache.org/), which can also provide interoperability between different frameworks. MXNet looks like it will deprecate ONNX in favor of TVM.
Benson


Re: GSoC 2020 Idea Discussion

Atharva Dubey
Respected Sir, 
Sorry for the late reply; my midterms are going on. TVM and ONNX address different needs (I am not so sure about this, though). TVM is used to compile deployable deep learning models that have to go into production on different architectures. In contrast, ONNX provides a common format, so one can, say, import a Keras or Caffe model into PyTorch. Please correct me if I am wrong about this. So maybe we can start by providing inter-framework support using ONNX, so that one can switch frameworks if needed.

Sincerely 
Atharva Dubey

Re: GSoC 2020 Idea Discussion

Benson Muite
On Thu, Mar 19, 2020, at 6:11 AM, Atharva Dubey wrote:
Respected Sirs, 
For GSoC 2020, as discussed, I will go ahead with integrating mlpack and using SYCL (an OpenCL-based parallelization standard) for acceleration. I plan first to integrate mlpack with Octave and then start building SYCL support for mlpack. Before drafting the proposal, I would appreciate a review of the approach and the scope of the project. I just wanted to run it by you before I submit the proposal.

Sincerely 
Atharva Dubey

Hi Atharva,
Have you interacted with the mlpack developers? Did you get any response from them? What about importing and exporting models?
Regards,
Benson
Re: GSoC 2020 Idea Discussion

Atharva Dubey
Respected Sir, 
mlpack is an existing ML framework, and with each release they are working on optimizing their current algorithms (grid search, SGD, etc.). They also plan to diversify their bindings into other languages like C# and Java, and to improve parallelization.

So we can probably go ahead with integrating mlpack; after that, keeping up with each release would mainly be a matter of code maintenance.

Regarding the import and export of models, I am inclined towards exporting/importing them as ONNX models, since the TVM stack is aimed at compiling and shipping models to hardware targets.

Sincerely, 
Atharva Dubey


Re: GSoC 2020 Idea Discussion

Benson Muite
Hi Atharva,

It may be helpful to interact with the MLPack development community to get feedback on your plan:



Regards,
Benson

Re: GSoC 2020 Idea Discussion

Benson Muite

Hi Atharva,

May also be of interest:

Regards,
Benson
Re: GSoC 2020 Idea Discussion

Atharva Dubey
In reply to this post by Benson Muite
Respected Sir, 
I did email the mlpack mentor who would have overseen their automatic bindings to different languages, but did not get any response from him. Before I submit my draft proposal, any feedback from you would be very helpful.

Sincerely 
Atharva Dubey


Re: GSoC 2020 Idea Discussion

Benson Muite



Hi Atharva,
Regards,
Benson
Re: GSoC 2020 Idea Discussion

Atharva Dubey
Respected Sir,
I did try to contact the mlpack developers through various channels, but did not get any response, probably because I contacted them this close to the proposal submission date. I will go ahead with the proposal and submit the draft by tonight.
Sincerely, 
Atharva Dubey

