Posts Tagged chainer

Preferred Networks achieved the world’s fastest training time in deep learning, completing training on ImageNet in 15 minutes using the distributed learning package ChainerMN and a large-scale parallel computer

November 10, 2017, Tokyo – Preferred Networks, Inc. (PFN, Headquarters: Chiyoda-ku, Tokyo, President and CEO: Toru Nishikawa) has achieved the world’s fastest training time in deep learning by using its large-scale parallel computer MN-1[1].

With the size of training data and the number of parameters expanding for the sake of better accuracy of deep learning models, computation time is also increasing, and it is not unusual for training a model to take several weeks. Linking multiple GPUs for faster training is therefore essential to reduce the time spent on trial and error and on verifying new ideas, and to produce results quickly.

On the other hand, it is generally known in parallel/distributed learning that the accuracy and training efficiency of a model gradually degrade as the number of GPUs increases, due to larger batch sizes and GPU communication overhead.

To address these issues, PFN improved learning algorithms and parallel performance, and used one of Japan’s most powerful parallel computers, equipped with 1,024 NVIDIA(R) Tesla(R) P100 GPUs across multiple nodes, together with Chainer’s distributed learning package ChainerMN[2] for training.

As a result, we completed training ResNet-50[3] for image classification on the ImageNet[4] dataset in 15 minutes, a significant improvement over the previously best known result[5].

The research paper on this achievement, “Extremely Large Minibatch SGD: Training ResNet-50 on ImageNet in 15 Minutes”, is available at the following URL: https://www.preferred-networks.jp/docs/imagenet_in_15min.pdf

Based on this research result, PFN will further accelerate its research and development activities in the fields of transportation systems, manufacturing, and bio/healthcare, which require large-scale deep learning.

 

[1] One of the most powerful private supercomputers in Japan, containing 1,024 NVIDIA(R) Tesla(R) P100 GPUs across multiple nodes. https://www.preferred-networks.jp/en/news/pr20170920

[2] A package adding distributed learning functionality with multiple GPUs to the open source deep learning framework Chainer

[3] A network frequently used in the field of image recognition

[4] A dataset widely used for image classification

[5] Training completed in 31 minutes using 1,600 Intel(R) Xeon(R) Platinum 8160 processors (Y. You et al., ImageNet Training in Minutes, CoRR, abs/1709.05011, 2017)

 

■ About the Chainer Open Source Deep Learning Framework

Chainer is a Python-based deep learning framework being developed mainly by PFN, which has unique features and powerful performance that allow for designing complex neural networks easily and intuitively, thanks to its “Define-by-Run” approach. Since it was open-sourced in June 2015, as one of the most popular frameworks, Chainer has attracted not only the academic community but also many industrial users who need a flexible framework to harness the power of deep learning in their research and real-world applications.

Chainer incorporates the results of the latest deep learning research. With additional packages such as ChainerMN (distributed learning), ChainerRL (reinforcement learning), ChainerCV (computer vision) and through the support of Chainer development partner companies, PFN aims to promote the most advanced research and development activities of researchers and practitioners in each field. (http://chainer.org/)

■ About Preferred Networks, Inc.

Founded in March 2014 with the aim of promoting business utilization of deep learning technology focused on IoT, PFN advocates Edge Heavy Computing as a way to handle the enormous amounts of data generated by devices in a distributed and collaborative manner at the edge of the network, driving innovation in three priority business areas: transportation, manufacturing and bio/healthcare. PFN promotes advanced initiatives by collaborating with world leading organizations, such as Toyota Motor Corporation, Fanuc Corporation, and the National Cancer Center. (https://www.preferred-networks.jp/en/)

*Chainer(R) is the trademark or the registered trademark of Preferred Networks, Inc. in Japan and other countries.

 

 

Preferred Networks released open source deep learning framework Chainer v3 and NVIDIA GPU array calculation library CuPy v2

Preferred Networks, Inc. (PFN, Headquarters: Chiyoda-ku, Tokyo, President and CEO: Toru Nishikawa) has released Chainer v3, a major update of the open source deep learning framework Chainer(R), as well as NVIDIA(R) GPU array calculation library CuPy™ v2.

We release a major upgrade of Chainer every three months, quickly incorporating the results of the latest deep learning research. The newly released Chainer v3 runs most existing user code without modification.

 

Main features of Chainer v3 and CuPy v2 include:

1. Automatic differentiation of second- and higher-order derivatives

Chainer now supports automatic differentiation of second-order and higher derivatives for many functions. This enables users to implement deep learning methods that require second-order differentiation directly from the equations written in papers.
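
As a rough illustration of the feature, the sketch below computes a second derivative in the Chainer v3 style by differentiating a gradient that was produced with double backpropagation enabled; the specific function and values are only examples, not part of the release itself.

```python
import numpy as np
import chainer
import chainer.functions as F
from chainer import Variable

x = Variable(np.array([2.0], dtype=np.float32))
y = F.sum(x ** 3)  # y = x^3, summed to a scalar

# First derivative dy/dx = 3x^2, keeping the graph so it can be
# differentiated again (double backpropagation).
gx, = chainer.grad([y], [x], enable_double_backprop=True)

# Second derivative d^2y/dx^2 = 6x, obtained by differentiating the gradient.
ggx, = chainer.grad([F.sum(gx)], [x])

print(gx.data)   # [12.] = 3 * 2^2
print(ggx.data)  # [12.] = 6 * 2
```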

 

2. Improved CuPy memory allocation

In many neural networks, GPU memory is now used more efficiently, and memory reallocations are reduced in some cases, which also increases speed.
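
CuPy serves GPU allocations from a memory pool, which is the component this improvement targets. The sketch below shows how a user might inspect and release that pool; note that the byte-counting methods shown here may only be available in CuPy releases newer than v2, so treat this as an illustration rather than a v2 API reference.

```python
import cupy

pool = cupy.get_default_memory_pool()

# Allocations are served from the pool, so repeated allocations of similar
# sizes can reuse cached blocks instead of calling cudaMalloc every time.
x = cupy.zeros((1024, 1024), dtype=cupy.float32)
print(pool.used_bytes(), pool.total_bytes())   # may require a newer CuPy

del x
pool.free_all_blocks()  # return cached blocks to the device
print(pool.used_bytes(), pool.total_bytes())
```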

 

3. Sparse matrix support has been added to CuPy

Large-scale graph analysis and natural language processing, which previously were costly to implement on GPUs, can now be implemented more easily thanks to sparse matrix computation being available on the GPU.
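
A minimal sketch of GPU sparse matrices, assuming the SciPy-compatible interface CuPy exposes (cupy.sparse in CuPy v2, cupyx.scipy.sparse in later releases); the matrix contents are arbitrary example values.

```python
import cupy
import cupyx.scipy.sparse as sparse  # CuPy v2 exposed this as cupy.sparse

# Build a small CSR matrix on the GPU using the SciPy-compatible constructor.
data = cupy.array([1.0, 2.0, 3.0], dtype=cupy.float32)
indices = cupy.array([0, 2, 1], dtype=cupy.int32)
indptr = cupy.array([0, 2, 3, 3], dtype=cupy.int32)
a = sparse.csr_matrix((data, indices, indptr), shape=(3, 3))

# Sparse matrix-vector product runs entirely on the GPU.
x = cupy.ones(3, dtype=cupy.float32)
print(a.dot(x))  # -> [3. 3. 0.]
```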

◆ Chainer Release Note: https://github.com/chainer/chainer/releases/tag/v3.0.0

Like its previous versions, Chainer v3 incorporates development results from many external contributors. PFN will continue working with supporting companies and the OSS community to promote the development and popularization of Chainer.

 

◆ About the Chainer Open Source Deep Learning Framework

Chainer is a Python-based deep learning framework developed by PFN, which has unique features and powerful performance that enables users to easily and intuitively design complex neural networks, thanks to its “Define-by-Run” approach. Since it was open-sourced in June 2015, as one of the most popular frameworks, Chainer has attracted not only the academic community but also many industrial users who need a flexible framework to harness the power of deep learning in their research and real-world applications.

Chainer incorporates the results of the latest deep learning research. With additional packages such as ChainerMN (distributed learning), ChainerRL (reinforcement learning), ChainerCV (computer vision) and through the support of Chainer development partner companies, PFN aims to promote the most advanced research and development activities of researchers and practitioners in each field. (http://chainer.org/)

*Chainer(R) and CuPy(TM) are trademarks or registered trademarks of Preferred Networks, Inc. in Japan and other countries.

Preferred Networks Launches one of Japan’s Most Powerful Private Sector Supercomputers

Features NTT Com Group’s Cloud-based GPU platform

TOKYO, JAPAN — Preferred Networks, Inc. (PFN), a provider of IoT-centric deep learning systems, NTT Communications Corporation (NTT Com), the ICT solutions and international communications business within the NTT Group, and NTT Com subsidiary NTT PC Communications Incorporated (NTTPC) today announced the launch this September of a private supercomputer designed to facilitate research and development of deep learning, including autonomous driving and cancer diagnosis.

The new supercomputer is one of the most powerful developed by the private sector in Japan. It is equipped with NTT Com and NTTPC’s Graphics Processing Unit (GPU) platform and contains 1,024 NVIDIA(R) Tesla(R) P100 GPUs across multiple nodes. Theoretically, its processing speed can reach 4.7 PetaFLOPS (4,700 trillion floating point operations per second), among the fastest of any computing environment in Japan.

Overview of the private supercomputer

 

PFN’s deep learning research demands an ultra high-speed, high capacity, state-of-the-art computing environment. Existing GPU platforms require massive electricity supplies, generate excessive heat and offer inadequate network speed. To address these issues, PFN adopted the NTT Com Group’s proven GPU platform, which boasts significantly advanced technology. They additionally leveraged the latest data center design, building a large-scale multi-node platform using ChainerMN, PFN’s technology that significantly accelerates the speed of deep learning by parallelizing calculations over multiple nodes.

NTT Com group has developed and released a multi-node GPU platform on Enterprise Cloud, Nexcenter(TM), a world-leading data center service, which incorporates the group’s extensive know-how in GPU performance maximization.

Following the supercomputer launch, PFN plans to increase the processing speed of its open source deep learning framework Chainer and to further accelerate its research and development in the fields of transportation systems, manufacturing, and bio/healthcare, which require huge amounts of computing resources. PFN will additionally consider deploying NVIDIA(R) Tesla(R) V100 GPUs, which are based on the next-generation Volta GPU architecture. The NTT Com Group will continue to support PFN’s research and the commercialization of its solutions with AI-related technologies and platforms.

“NVIDIA is excited to see the launch of Preferred Networks’ private supercomputer, built in partnership with NTT Com Group. Computing power is the source of competitive advantage for deep learning, the core technology of modern AI. We have high expectations that the new system will accelerate Preferred Networks’ business and contribute to Japan’s economic growth.”

Masataka Osaki
NVIDIA Japan Country Manager, Vice President of Corporate Sales

 

Related links:

Chainer
Enterprise Cloud
Nexcenter

 

◆ About Preferred Networks, Inc.
Founded in March 2014 with the aim of promoting business utilization of deep learning technology focused on IoT, PFN advocates Edge Heavy Computing as a way to handle the enormous amounts of data generated by devices in a distributed and collaborative manner at the edge of the network, driving innovation in three priority business areas: transportation, manufacturing and bio/healthcare. PFN develops and provides Chainer, an open source deep learning framework. PFN promotes advanced initiatives by collaborating with world leading organizations, such as Toyota Motor Corporation, Fanuc Corporation, and the National Cancer Center.
https://www.preferred-networks.jp/en/

◆ About NTT Communications Corporation
NTT Communications provides consultancy, architecture, security and cloud services to optimize the information and communications technology (ICT) environments of enterprises. These offerings are backed by the company’s worldwide infrastructure, including the leading global tier-1 IP network, the Arcstar Universal One™ VPN network reaching over 190 countries/regions, and 140 secure data centers worldwide. NTT Communications’ solutions leverage the global resources of NTT Group companies including Dimension Data, NTT DOCOMO and NTT DATA.
www.ntt.com | Twitter@NTT Com | Facebook@NTT Com | LinkedIn@NTT Com

◆ NTT PC Communications Incorporated
NTTPC Communications Incorporated (NTTPC), established in 1985, is a subsidiary of NTT Communications and a network service and communication solution provider in the Japanese telecom market. The company has been one of the group’s most strategic technology companies over the years. NTTPC launched the NTT group’s first ISP service, “InfoSphere”, in 1995, and Japan’s first Internet data center and server hosting service, “WebARENA”, in 1997. NTTPC has always pioneered new offerings in the ICT market.
http://www.nttpc.co.jp/english/

 

 

Notes
1. Chainer(R) is the trademark or the registered trademark of Preferred Networks, Inc. in Japan and other countries.

2. Other company names and product names written in this release are the trademarks or the registered trademarks of each company.

Preferred Networks officially released ChainerMN version 1.0.0, a multi-node distributed learning package, making it even faster with stabilized data-parallel core functions

Tokyo, Japan, September 1, 2017 – Preferred Networks, Inc. (PFN, Headquarters: Chiyoda-ku, Tokyo, President and CEO: Toru Nishikawa) has released the official version 1.0.0 of ChainerMN※1, which is a package adding distributed learning functionality with multiple GPUs to Chainer, the open source deep learning framework developed by PFN.

For practical applications of machine learning and deep learning technologies, the ever-increasing complexity of neural network models, with larger numbers of parameters and much larger training datasets, requires more and more computational power to train these models.

ChainerMN is a multi-node extension to Chainer that realizes large-scale distributed deep learning through high-speed intra- and inter-node communication. PFN released the beta version of ChainerMN on May 9, 2017, and this is the first official release. The following features have been added in ChainerMN v1.0.0.

● Features of ChainerMN v1.0.0

1. Increased stability in core functions during data parallelization

With this improved stability, ChainerMN can be used more comfortably.

2. Compatibility with NVIDIA Collective Communications Library (NCCL) 2.0.

Support for the latest NCCL version makes ChainerMN even faster.

3. More sample code (machine translation, DCGAN) is available.

These examples will help users learn more advanced ways of using ChainerMN.

4. Expansion of supported environments (non-CUDA-Aware MPI).

A CUDA-aware MPI implementation such as Open MPI or MVAPICH was required for the beta version, but ChainerMN is now also compatible with non-CUDA-aware MPI.

5. Initial implementation of model parallelism functions.

Multiple GPUs can now cooperate in a model-parallel fashion, enabling more complex forms of distributed learning.
The conventional data-parallel approach is known to limit the usable batch size when increasing the number of nodes while maintaining accuracy. To overcome this, we have completed the initial part of the more challenging model-parallel implementation, which aims for greater speed than is possible with data parallelism.

 

These features make deep learning with ChainerMN more stable and faster than ever, and improve its usability.

The following is the result of a performance measurement of ChainerMN on the ImageNet image classification dataset. It is about 1.4 times faster than at the first announcement in January this year, and 1.1 times faster than the beta version released in May. Please visit the following Chainer Blog post to learn more about the experiment settings:

https://chainer.org/general/2017/02/08/Performance-of-Distributed-Deep-Learning-Using-ChainerMN.html

 

In addition, from October 2017, ChainerMN will become available on “XTREME DNA”, an unmanned cloud-based supercomputer deployment and operation service provided by XTREME Design Inc. (Head office: Shinagawa-ku, Tokyo, CEO: Naoki Shibata).

ChainerMN will be added to the distributed parallel environment templates for GPU instances of the pay-per-use public cloud Microsoft Azure. This not only eliminates the need to build the infrastructure required for large-scale distributed deep learning but also makes it easy to manage research and development costs.

ChainerMN aims to provide an environment in which deep learning researchers and developers can easily concentrate on the main parts of research and development including the design of neural networks. PFN will continue to improve ChainerMN by adding more features and expanding its usage environment.

 

◆ The Open Source Deep Learning Framework Chainer (http://chainer.org)
Chainer is a Python-based deep learning framework developed by PFN, which has unique features and powerful performance that enables users to easily and intuitively design complex neural networks, thanks to its “Define-by-Run” approach. Since it was open-sourced in June 2015, as one of the most popular frameworks, Chainer has attracted not only the academic community but also many industrial users who need a flexible framework to harness the power of deep learning in their research and real-world applications.

 

※1:MN in ChainerMN stands for Multi-Node. https://github.com/pfnet/chainermn

Preferred Networks released Version 2 of Chainer, an Open Source framework for Deep Learning

New functions developed, including a significant increase in memory efficiency during learning

Tokyo, Japan, June 2, 2017 – Preferred Networks, Inc. (PFN, Headquarters: Chiyoda-ku, Tokyo, President and CEO: Toru Nishikawa) has released a major update of its open source deep learning framework Chainer, called Chainer v2.

This is the first major version update since the official release of Chainer in 2015, and it enables more powerful, flexible, and intuitive functions to implement and study deep learning methods.

With the rapid evolution of deep learning technology and an expanding range of target applications, user demands on the functionality of deep learning frameworks are rapidly changing and diversifying.

Chainer incorporates the results of the latest deep learning research. With additional packages such as ChainerMN (distributed learning), ChainerRL (reinforcement learning), ChainerCV (computer vision) and through the support of Chainer development partner companies, PFN aims to promote the most advanced research and development activities of researchers and practitioners in each field.

 

Chainer v2 has three major enhancements and improvements.

 

1. Improved memory efficiency during learning

Chainer v2 shows significantly reduced memory usage without sacrificing learning speed. Memory usage has been confirmed to drop by 33% or more when learning with ResNet-50, a network widely used in image recognition. This makes it easier to design larger networks and allows learning with larger batch sizes on typical networks.

 

2. Chainer’s accompanying array library CuPy has been separated and made into an independent project, allowing a broader range of HPC applications to be easily accelerated using GPUs

The general-purpose array library CuPy is highly compatible with NumPy, which is very popular in scientific computing, so code written for NumPy can run faster on the GPU with little or no modification. By separating CuPy into an independent library, we aim to grow its user base and expand its applications beyond deep learning into other research and development fields.
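
A minimal sketch of this compatibility, assuming only standard NumPy and CuPy calls; the normalize helper is a hypothetical example of code that runs unchanged on either backend.

```python
import numpy as np
import cupy as cp

def normalize(xp, x):
    # Works with either NumPy or CuPy thanks to the shared array API.
    return (x - xp.mean(x)) / xp.std(x)

x_cpu = np.random.rand(1_000_000).astype(np.float32)
x_gpu = cp.asarray(x_cpu)          # copy the array to the GPU

y_cpu = normalize(np, x_cpu)       # runs on the CPU
y_gpu = normalize(cp, x_gpu)       # same code, runs on the GPU

print(np.allclose(y_cpu, cp.asnumpy(y_gpu), atol=1e-5))
```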

 

3. Organized the API and made it more intuitive

One of the major features of Chainer is its ability to describe a complex neural network intuitively as a program. We took into consideration the various use cases and needs of the community, removed unnecessary options, and organized the interfaces to provide a more refined API. With a more intuitive description, unintentional bugs occur less frequently.

 

● Chainer Release Note: https://github.com/chainer/chainer/releases/tag/v2.0.0

● Chainer Upgrade Guide: https://docs.chainer.org/en/stable/upgrade.html

● Chainer Blog: https://chainer.org/announcement/2017/06/01/released-v2.html

 

The Chainer team plans to release major version updates every four months to support the most advanced research and development activities for researchers and practitioners in each field.

Development results from many external contributors are also included in the Chainer v2 release. PFN will continue to work with supporting companies and the OSS community to promote the development and dissemination of Chainer.

 

◆ Chainer Meetup # 05

Community event for developers and researchers who use Chainer.

  • Date: June 10, 2017, 14:00–18:30
  • Place: Microsoft Japan Co., Ltd. Shinagawa Office, Seminar Room A
    (Shinagawa Grand Central Tower 31f, 2-16-3 Konan, Minato-ku, Tokyo)
  • Application: https://chainer.connpass.com/event/57307/

 

◆ Kick-off for the Deep Learning Lab Community

The Deep Learning Lab is a community of professionals who are well versed in both technology and business and who apply the latest deep learning technology to real business. It uses Microsoft Azure and Chainer as its key platform and framework, and disseminates information about use cases and the latest technology trends.

  • Date: Monday, June 19, 2017, 9:00-12:30
  • Place: Microsoft Japan Co., Ltd. Shinagawa Office
    (Shinagawa Grand Central Tower 31F, 2-16-3 Konan, Minato-ku, Tokyo)
  • Application: https://dllab.connpass.com/event/57981/

 

◆ About the Chainer Open Source Deep Learning Framework (http://chainer.org)

Chainer is a Python-based deep learning framework developed by PFN, which has unique features and powerful performance that enables users to easily and intuitively design complex neural networks, thanks to its “Define-by-Run” approach. Since it was open-sourced in June 2015, as one of the most popular frameworks, Chainer has attracted not only the academic community but also many industrial users who need a flexible framework to harness the power of deep learning in their research and real-world applications.

*Chainer(R) and DIMo(TM) are trademarks of Preferred Networks, Inc. in Japan and other countries.

Drawing app “pixiv Sketch” and automatic coloring service “PaintsChainer” collaborate to provide a new function for automatic coloring of illustrations!

Artificial Intelligence (AI) supports the “coloring” of sketches and illustrations by providing new functions to recognize faces, clothes, and background in the image and automatically filling them with color and shading.

Tokyo, Japan, 24 May 2017 – pixiv Inc. (President: Hiroki Ito, Headquarters: Shibuya-ku, Tokyo) and AI startup Preferred Networks, Inc. (President & CEO: Toru Nishikawa, Headquarters: Chiyoda-ku, Tokyo, hereinafter referred to as PFN) have collaborated to add a new automatic coloring function, realized by “PaintsChainer”, to the drawing communication platform “pixiv Sketch”, available from Wednesday, May 24, 2017.

pixiv Sketch is a communication platform that allows users to post drawings directly from devices such as PCs and smartphones. Even when relaxing or playing outside with friends, users can paint anytime and anywhere and experience communication in real-time by posting and sharing their drawings.

The new functionality added to pixiv Sketch is realized using PaintsChainer technology, which automatically selects painting colors and was trained on pairs of line drawings and colored illustrations using Chainer, a deep learning framework developed and provided by PFN.

It allows the user to perform the important “coloring” step of illustration production by selecting a picture drawn on pixiv Sketch or an external image file and then simply clicking the automatic coloring button. The face, clothing, and background of the illustration are recognized by AI and colors are added automatically. Users can also place a favorite color chosen from a color palette at any point on the line drawing as a hint for the automatic coloring.

pixiv and PFN will continue to provide valuable services to make drawing and painting more natural and pleasant through AI technology and research.

◆ Automatic coloring function in pixiv Sketch

Release date: May 24th

Cost: free

URL: https://sketch.pixiv.net/ (Available only on Web version)

How to use the new function;

1. Draw a line drawing or select an image of a line drawing

2. Start the automatic coloring tool by pressing the “Automatic coloring” button

3. Select the coloring pattern of your choice from two different styles

4. If necessary, put color hints from the color palette to adjust the coloring

5. After specifying the colors, click the arrow button to complete the coloring process!

 

◆ pixiv Sketch  https://sketch.pixiv.net/

pixiv Sketch is a painting communication platform launched with the desire to “make everyday paintings more casual and fun”. It is a service where you can post images you’ve painted anytime, anywhere through devices such as PCs and smartphones.

 

◆ PaintsChainer   https://paintschainer.preferred.tech/

PaintsChainer is developed and offered by PFN, and received a great response on Twitter and other social media sites when the service was released in January 2017. Users can upload a black-and-white line drawing and have it colored automatically using deep learning technology. The user can also supply color hints to control the colorization results.

 

◆ About Preferred Networks, Inc. https://www.preferred-networks.jp/

Founded in March 2014 with the aim of promoting business utilization of deep learning technology focused on IoT, PFN advocates Edge Heavy Computing as a way to handle the enormous amounts of data generated by devices in a distributed and collaborative manner at the edge of the network, and realizes innovation in its three priority business areas: transportation systems, manufacturing, and bio/healthcare.

PFN develops and provides solutions based on the Deep Intelligence in Motion (DIMo) platform, which provides state-of-the-art deep learning technology. PFN promotes advanced initiatives by collaborating with world-leading organizations such as Toyota Motor Corporation, Fanuc Corporation, and the National Cancer Center.

Preferred Networks and Microsoft enter a strategic collaboration in the field of deep learning solutions

Tokyo, Japan, 23 May 2017 – Today, Preferred Networks, Inc. (Headquarters: Chiyoda-ku, Tokyo, President and CEO: Toru Nishikawa, hereinafter referred to as PFN) and Microsoft Corporation (Headquarters: Redmond, Washington, USA, CEO: Satya Nadella) have agreed to strategically collaborate in the field of deep learning solutions with the aim to accelerate the applications of artificial intelligence and deep learning in the business world.

Based on this alliance, both companies will promote cooperation between Microsoft’s public cloud platform Microsoft Azure and PFN’s deep learning technology to provide deep learning solutions for solving problems across a broad range of industries. Microsoft Japan Co., Ltd. (Headquarters: Minato-ku, Tokyo, President: Takuya Hirano) will fully support the delivery of this collaboration to the Japanese market.

 

Through this collaboration, both companies will work together in the following three areas;
1) Technology, 2) Human resource development, 3) Marketing.

1. Technology:

  • Challenges that engineers face in deep learning include the increasing time required to train complex neural networks, the growing management complexity associated with ever-increasing data, the need to remain flexible and adaptable to the rapid progress of deep learning, and the methodology of system development around deep learning. To tackle these challenges, both companies will enhance the compatibility between Microsoft Azure IaaS and PFN’s deep learning framework Chainer by the summer of 2017: providing an Azure template to deploy Chainer/ChainerMN (MN stands for Multi-Node) on Azure IaaS with a single click, adding Chainer to the Data Science Virtual Machines, supporting Chainer on Azure Batch and SQL Server, and improving Chainer on Windows.

 

  • Currently, the standard way of developing neural networks is to build them from scratch. However, this requires a high level of technical knowledge and a large investment. To drive the application of deep learning in the real world, it is essential to move from development from scratch to standardized solutions. To realize this shift, Microsoft Azure Data + Analytics products and PFN’s deep learning platform Deep Intelligence in Motion (DIMo) will be combined to provide solutions for specific workloads and industries throughout 2017. In addition, both companies will support and nurture partners in the development of these solutions to accelerate their broader implementation in the real world.

 

2. Human resource development:

  • The development of data science human resources is one of the main issues in applying deep learning to the real world. To address this issue, both companies will work together to provide training programs for university students, engineers, and researchers throughout 2017. In addition, both companies will consider participating in data science programs for human resource development, typically organized by governments for higher education institutions.

 

  • The training programs cover not only the basics of neural networks but also advanced classes that teach how to apply deep learning to real-world business applications. Through these programs, both companies plan to train 50,000 people in three years. As goals for the training, they are considering programs such as Imagine Cup, one of the world’s largest IT contests for students, and Azure for Research, which aim to foster internationally competitive IT talent.

 

3. Marketing:

  • Deep learning is just one method within machine learning, but it is now presented to many people simply as artificial intelligence. As a result, it is difficult for customers to determine whether deep learning can effectively solve their business problems. Both companies will start industry-specific customer workshops in the summer of 2017, based on the deep learning business knowledge cultivated by Microsoft and PFN and on real success stories using Microsoft Azure, Chainer, and DIMo.

 

  • By combining the latest deep learning technologies provided by Chainer and DIMo with the solid Azure platform, both companies will provide an enterprise-grade end-to-end solution that can be incorporated into customers’ core systems throughout 2017.

 

  • To match customers who want to solve business problems with deep learning with companies that provide consulting services and system development using deep learning, a community named “Deep Learning Lab” has been established. The community will hold briefings on June 19 and July 25, 2017.
    https://dllab.connpass.com/

Preferred Networks released ChainerMN, a multi-node extension to Chainer, an open source framework for deep learning

Tokyo, Japan, 9 May 2017 –

Today, Preferred Networks, Inc. (Headquarters: Chiyoda-ku, Tokyo, President and CEO: Toru Nishikawa, hereinafter PFN) released ChainerMN (MN stands for Multi-Node, https://github.com/pfnet/chainermn), which accelerates training speed by adding distributed learning functionality with multiple GPUs to Chainer, the open source deep learning framework developed by PFN.

Even though the performance of GPUs is continuously improving, the ever-increasing complexity of neural network models, with larger numbers of parameters and much larger training datasets, requires more and more computational power to train these models. Today, it is common for one training session to take more than a week on a single node of a state-of-the-art computer.

Aiming to provide researchers with an efficient way to conduct flexible trial-and-error iterations while using large training datasets, PFN developed ChainerMN, a multi-node extension for high-performance distributed training built on top of Chainer. We demonstrated that ChainerMN finished training a model in about 4.4 hours with 32 nodes and 128 GPUs, a task that would require about 20 days on a single-node, single-GPU machine.

 

  • Performance comparison experiment between ChainerMN and other frameworks

https://research.preferred.jp/2017/02/chainermn-benchmark-results/

We compared the performance benchmark results of ChainerMN with those of other popular multi-node frameworks. In our experiments with 128 GPUs, using a practical setting in which accuracy is not sacrificed too much for speed, ChainerMN outperformed the other frameworks.

 

When comparing scalability, although the single-GPU throughputs of MXNet and CNTK (both written in C++) are higher than that of ChainerMN (written in Python), the throughput of ChainerMN was the highest with 128 GPUs, showing that ChainerMN is the most scalable. This result is due to the design of ChainerMN, which is optimized for both intra-node and inter-node communication.

 

Existing Chainer users can easily benefit from the performance and scalability of ChainerMN simply by changing a few lines of their original training code.
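
A minimal sketch of what those few lines typically look like, based on the ChainerMN documentation of this period; MyModel stands in for any existing Chainer model, and the optimizer and dataset are only examples.

```python
import chainer
import chainermn

# One MPI process per GPU; the communicator handles all collective operations.
comm = chainermn.create_communicator()
chainer.cuda.get_device_from_id(comm.intra_rank).use()

model = MyModel()   # hypothetical: an unchanged, existing Chainer model
model.to_gpu()

# Wrap the usual optimizer so gradients are averaged across all workers.
optimizer = chainermn.create_multi_node_optimizer(
    chainer.optimizers.MomentumSGD(lr=0.1), comm)
optimizer.setup(model)

# Only the root process loads the dataset; it is then scattered to workers.
train = chainer.datasets.get_mnist()[0] if comm.rank == 0 else None
train = chainermn.scatter_dataset(train, comm)
# ...the rest of the training loop (iterator, updater, trainer) is unchanged.
```

The script is then launched with an MPI launcher, for example `mpiexec -n 8 python train.py`, with one process per GPU.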

ChainerMN has already been used in multiple projects in a variety of fields such as natural language processing and reinforcement learning.

 

  • About the open source deep learning framework Chainer

Chainer is a Python-based deep learning framework developed by PFN, which has unique features and powerful performance that enables users to easily and intuitively design complex neural networks thanks to its “Define-by-Run” approach. Since it was open-sourced in June 2015, as one of the most popular frameworks, Chainer has attracted not only the academic community but also many industrial users who need a flexible framework to harness the power of deep learning in their research and real-world applications. (http://chainer.org/)

 

  • About Preferred Networks, Inc.

Founded in March 2014 with the aim of promoting business utilization of deep learning technology focused on IoT, PFN advocates Edge Heavy Computing as a way to handle the enormous amounts of data generated by devices in a distributed and collaborative manner at the edge of the network, and realizes innovation in its three priority business areas: transportation systems, manufacturing, and bio/healthcare.

PFN develops and provides solutions based on the Deep Intelligence in Motion (DIMo) platform, which provides state-of-the-art deep learning technology. PFN promotes advanced initiatives by collaborating with world-leading organizations such as Toyota Motor Corporation, Fanuc Corporation, and the National Cancer Center. (https://www.preferred-networks.jp/en/)

 

*Chainer(R) and DIMo(TM) are trademarks of Preferred Networks, Inc. in Japan and other countries.