Google Coral Edge TPU

Google Coral Edge TPU is distributed worldwide through its agent Gravitylink!

Getting Started with the Google Coral Edge TPU USB Accelerator

2020-06-28 12:05:43 | Google AIY
Coral Beta
The TPU, or Tensor Processing Unit, is mainly used in Google's data centers. General users can access it on Google Cloud Platform (GCP), or use a free version through Google Colab.
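If you want to check quickly whether a Colab session actually has a TPU runtime attached, a check like the sketch below works; this is an illustrative snippet, not part of the Coral tooling, and relies on the COLAB_TPU_ADDR environment variable that Colab's TPU runtime sets.

```python
# Minimal check, inside a Colab notebook, that a TPU runtime is attached.
# COLAB_TPU_ADDR is set by Colab's TPU runtime; it is absent on CPU/GPU runtimes.
import os

addr = os.environ.get('COLAB_TPU_ADDR')
if addr:
    print('TPU runtime attached at grpc://' + addr)
else:
    print('No TPU runtime attached')
```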
Google first demonstrated its Edge TPU at the 2019 International Consumer Electronics Show (and at this year's TensorFlow Dev Summit), and then released the Coral Beta in March.
The Beta lineup includes a development board and a USB accelerator, as well as preview versions of a PCI-E accelerator and a system-on-module (SoM) intended for production use.
USB Accelerator
The Edge TPU USB Accelerator works like any other USB device, similar to Intel's Myriad VPU, but more powerful. Next, let's unpack it and take a look.
Unpacking
The box contains:
A getting-started guide
The USB Accelerator
A USB Type-C data cable
Getting started
The getting-started guide describes the installation steps so that you can complete the setup quickly. All required files, including model files, can be downloaded from the official website together with the installation package. The installation does not depend on the TensorFlow or OpenCV libraries.
Tip: You must use Python 3.5, otherwise the installation cannot be completed. You also need to change the last line of the install.sh file from python3.5 setup.py develop --user to python3 setup.py develop --user.
Demo program
The Coral Edge TPU API documentation includes an overview of image classification and object detection, as well as demo programs.
Edge TPU API
Before working through the following tutorial, note these points about the Edge TPU API:
You need to install the Python edgetpu module to run TensorFlow Lite models on the Edge TPU. It is a higher-level API with simple methods for running model inference.
These APIs are pre-installed on the development board, but if you use the USB Accelerator you need to install them yourself. Please refer to the setup guide for details.
The following key APIs are used during inference: ClassificationEngine for image classification, DetectionEngine for object detection, and ImprintingEngine for transfer learning.
Image classification
The demo implementing image classification is very simple.
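A minimal sketch of such a demo with the edgetpu Python library is shown below; the model and image file names are placeholders for the files downloaded from the official site, and method names may differ slightly between library versions.

```python
# Minimal image-classification sketch with the edgetpu Python library.
# File names are placeholders for the files bundled with the installer.
from edgetpu.classification.engine import ClassificationEngine
from PIL import Image

engine = ClassificationEngine('mobilenet_v2_1.0_224_quant_edgetpu.tflite')
img = Image.open('bird.jpg')

# Returns the top-2 (label_id, confidence) pairs above the threshold.
for label_id, score in engine.classify_with_image(img, threshold=0.1, top_k=2):
    print(label_id, score)
```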
Object Detection
As with image classification, we only need to call the DetectionEngine interface to detect objects in the input image and mark them with boxes.
Since the default configuration produces many low-confidence false positives, we can raise the threshold in the default sample program from 0.05 to 0.5 and increase the width of the rectangle to 5.
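A corresponding sketch, again with placeholder file names and the edgetpu Python API, which may vary between versions:

```python
# Object-detection sketch; the threshold is raised from the sample's 0.05 to
# 0.5 to suppress low-confidence detections, and boxes are drawn 5 px wide.
from edgetpu.detection.engine import DetectionEngine
from PIL import Image, ImageDraw

engine = DetectionEngine('mobilenet_ssd_v2_coco_quant_postprocess_edgetpu.tflite')
img = Image.open('street.jpg')
draw = ImageDraw.Draw(img)

for obj in engine.detect_with_image(img, threshold=0.5, top_k=10,
                                    relative_coord=False):
    # bounding_box holds [[xmin, ymin], [xmax, ymax]] in pixel coordinates.
    draw.rectangle(obj.bounding_box.flatten().tolist(), outline='red', width=5)
    print(obj.label_id, obj.score)

img.save('street_annotated.jpg')
```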
Since Coral is still in beta, the API documentation is not yet complete, but what has been published so far is sufficient for the examples above.
Precautions
All the code, models, and label files for the demos above are downloaded from the official website together with the library files included in the installation package. With the models and label files provided so far, we can complete the classification and detection tasks.
For classification tasks, the results return the top two predicted categories with their confidence scores. For detection tasks, the results return the confidence scores and the corner coordinates of the bounding boxes. If category labels are supplied with the input, the returned results also contain the category names.
Further development
With the help of the Edge TPU, what other products can Coral offer?
For the development board, Google chose the NXP i.MX 8M SoC (quad-core Cortex-A53 plus a Cortex-M4F).
For experiments, especially when only the Edge TPU is required, we recommend the USB Accelerator.
Follow-up development
What if you have already built a good prototype using the Dev Board or USB Accelerator, but later need to run the same code in a large-scale production environment?
Google has thought about this in advance. As you can see in the product list, the following modules are intended for enterprise use and have been marked as coming soon.
This is a fully integrated system (including CPU, GPU, Edge TPU, Wi-Fi, Bluetooth, and a security element) packaged as a pluggable module measuring 40 mm × 40 mm.
The module is meant for large-scale production: manufacturers can build whatever I/O boards they like by following the module's design guidelines. Even the Dev Board mentioned above contains this detachable module; in theory, you can simply remove it and use it elsewhere.
There is very little information about the PCI-E Accelerator, but as the name suggests, it is a module that connects over PCI-E (Peripheral Component Interconnect Express). There are two variants, and both are similar to the USB Accelerator, except that the USB interface is replaced by PCI-E, so it plugs in like a memory module or a network card.
With these peripheral modules on the way, enterprise-level projects are expected to follow. Google Coral thinks so too, stating on its website:
Flexible and easy to use, precisely tailored for startups and large enterprises alike.
TensorFlow and the Coral project
Google's products are mostly tied to TensorFlow. At present, the Edge TPU only supports models in the conventional TensorFlow Lite format, and the stable version of TensorFlow Lite has only just been released.
Currently, you need to convert a tflite model into an Edge TPU-compatible tflite model through the web compiler. Don't worry if you use PyTorch or another framework: you can use ONNX to convert your model to a TensorFlow model first.
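As a rough illustration of that path, the sketch below exports a PyTorch model to ONNX, converts it to TensorFlow with the onnx-tf package, and writes a quantized tflite file for the Edge TPU compiler. This is an assumption-laden sketch: the APIs differ between package versions, and full-integer quantization for the Edge TPU also needs a representative dataset, which is omitted here.

```python
# Sketch: PyTorch -> ONNX -> TensorFlow -> quantized TFLite. The APIs shown
# come from torch, onnx, onnx-tf, and tensorflow, and vary across versions.
import torch
import torchvision
import onnx
from onnx_tf.backend import prepare
import tensorflow as tf

model = torchvision.models.mobilenet_v2(pretrained=True).eval()
torch.onnx.export(model, torch.randn(1, 3, 224, 224), 'model.onnx')

tf_rep = prepare(onnx.load('model.onnx'))   # ONNX -> TensorFlow
tf_rep.export_graph('model_tf')             # writes a SavedModel directory

converter = tf.lite.TFLiteConverter.from_saved_model('model_tf')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# A representative_dataset plus int8 input/output settings would also be
# needed for a model the Edge TPU compiler accepts; omitted for brevity.
open('model.tflite', 'wb').write(converter.convert())
```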

Edge AI performance: Google Edge TPU chip vs. NVIDIA Jetson Nano

2020-06-23 11:35:28 | Google AIY
Hardware
The devices I am interested in are the new NVIDIA Jetson Nano (128 CUDA cores) and the Google Coral Edge TPU (USB Accelerator). I will also test an i7-7700K + GTX 1080 (2560 CUDA cores), a Raspberry Pi 3B+, and my old friend, a 2014 MacBook Pro with an i7-4870HQ (no CUDA-capable GPU).
Software
I will use MobileNetV2 as the classifier, pretrained on the ImageNet dataset. I will use the model directly from Keras with the TensorFlow backend and floating-point weights for the GPUs, as well as the 8-bit quantized tflite version for the CPUs and the Coral Edge TPU.
First, the script loads the model and a magpie image. It then performs one prediction as a warm-up (I noticed the first prediction is always much slower than the following ones) and sleeps for one second so that all threads have finished. Then it classifies the same image 250 times. By using the same image for every classification, we keep the data close at hand throughout the test; after all, we are interested in inference speed, not in how fast random data can be loaded.
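For reference, here is a minimal sketch of that measurement loop for the floating-point Keras path; the image file name is an assumption, and the quantized variants would use the TFLite interpreter instead.

```python
# Benchmark-loop sketch: warm-up, 1 s sleep, then 250 classifications of the
# same image with Keras MobileNetV2 (floating-point path).
import time
import numpy as np
from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2, preprocess_input
from tensorflow.keras.preprocessing import image

model = MobileNetV2(weights='imagenet')
img = image.load_img('magpie.jpg', target_size=(224, 224))
batch = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

model.predict(batch)        # warm-up: the first prediction is much slower
time.sleep(1)               # let any background threads finish

start = time.monotonic()
for _ in range(250):
    model.predict(batch)
print('fps:', 250 / (time.monotonic() - start))
```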
The score of the quantized tflite model on the CPUs is slightly different, but it always returns the same prediction as the others, so I think this is a quirk of the model, and I'm fairly sure it does not affect performance.
Now, the results across platforms differ so widely that they are hard to picture, so here are some charts; choose your favorite.
Analysis
Three bars jump out of the first chart. (Yes, the first chart, the linear-scale fps one, is my favorite, because it shows the differences between the high-performance results.) Two of those bars belong to the Google Coral Edge TPU USB Accelerator; the third is a full NVIDIA GTX 1080 assisted by an Intel i7-7700K.
Look carefully and you will see that the GTX 1080 is actually beaten by the Coral. Let that sink in for a few seconds, then prepare to be blown away: the GTX 1080 has a maximum power draw of 180 W, which is absolutely huge compared to the Coral's 2.5 W.
Next we see that NVIDIA's Jetson Nano does not score highly. Although it has a CUDA-capable GPU, it is not actually much faster than my old i7-4870HQ. But that is the point: it is still faster than a 50 W, quad-core, hyper-threaded CPU, while the Jetson Nano never drew more than 12.5 W on a short-term average in my measurements. Power consumption is cut by 75% while performance improves by about 10%.
On its own, the Raspberry Pi is obviously not impressive here. It cannot run the floating-point model, and it is still not fast with the quantized model. But hey, I had the files ready anyway, so it ran the test; the more data the better, right? And it is still interesting, because it shows the difference between the ARM Cortex-A53 in the Pi and the Cortex-A57 in the Jetson Nano.
NVIDIA Jetson Nano
The Jetson Nano does not deliver impressive FPS rates with the MobileNetV2 classifier. But as I said, that does not mean it is not a very useful product: it is cheap and does not need much energy to run. Perhaps its most important attribute is that it runs TensorFlow-gpu (or any other ML platform) just like any other machine you have used before.
As long as your script does not dig deep into the CPU architecture, you can run exactly the same script as on an i7 + CUDA GPU, and you can even train! I still think NVIDIA should ship L4T with TensorFlow preinstalled, but I will try not to be angry about it anymore; after all, they provide a good explanation of how to install it (don't be fooled: TensorFlow 1.12 is not supported, only 1.13.1).
Google Coral Edge TPU
The Edge TPU is an ASIC (application-specific integrated circuit), which means it is built from small electronic components such as FETs and capacitors burned directly into the silicon, wired up to do exactly one thing: speed up inference.
Inference only, that is: the Edge TPU cannot perform backpropagation.
The logic behind this sounds more complicated than it is. (Actually creating the hardware and making it work is a completely different matter, and very, very complicated, but the logic itself is much simpler.) If you are really interested in how this works, look up "digital circuits" and "FPGA"; you will find enough information to keep you busy for the next few months. It can feel complicated at first, but it is really fun!
And this is exactly why the Coral is so different in performance per watt: it is a bundle of electronics designed to perform exactly the bitwise operations required, with virtually no overhead.
Why doesn't the GPU have an 8-bit model?
The GPU is essentially designed as a fine-grained parallel floating-point calculator, so floats are exactly what it was built for and where its strength lies. The Edge TPU is designed for 8-bit operations, and CPUs have ways of handling 8-bit integers faster than full-width floating-point numbers, because they have to deal with 8-bit values in many cases.
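To make the 8-bit idea concrete, here is a toy illustration (not Coral code) of the affine quantization scheme used by 8-bit tflite models, where real_value = scale * (quantized_value - zero_point); the scale and zero point are example values.

```python
# Toy affine quantization, as used by 8-bit tflite models:
# real_value = scale * (quantized_value - zero_point). Example values only.
import numpy as np

def quantize(x, scale, zero_point):
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return scale * (q.astype(np.float32) - zero_point)

x = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
q = quantize(x, scale=1 / 128, zero_point=128)
print(q)                              # [  0 128 192 255] (1.0 saturates)
print(dequantize(q, 1 / 128, 128))    # [-1.  0.  0.5  0.9921875]
```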
Why choose MobileNetV2?
I could give you many reasons why MobileNetV2 is a good model, but the main one is that it is one of the precompiled models Google provides for the Edge TPU.
What other models can the Edge TPU run?
It used to be just different versions of MobileNet and Inception, but as of last weekend Google released an update that lets us compile custom TensorFlow Lite models. The limitation remains, and may always remain, that they must be TensorFlow Lite models. That is different from the Jetson Nano, which can run anything you can imagine.
Raspberry Pi + Coral compared to others
Why does the Coral look much slower when attached to a Raspberry Pi? The answer is simple and straightforward: the Raspberry Pi only has USB 2.0 ports, while the other hosts have USB 3.0. And since the i7-7700K paired with the Coral is faster than the Jetson Nano paired with it, yet still does not reach the score NVIDIA measured for the Coral Dev Board, we can conclude that the bottleneck is the data rate, not the Edge TPU itself.
I think that is long enough, for me and probably for you too. I am amazed by what the Google Coral Edge TPU can do. But for me, the most interesting setup is the combination of an NVIDIA Jetson Nano and a Coral USB Accelerator. I will definitely be using that setup, and it feels like a dream.
Speaking of Google Coral's Dev Board and the Edge TPU, it is worth mentioning Model Play, which is built around the Coral Dev Board. Developed by a team in China, it is an AI model sharing marketplace for AI developers worldwide. Model Play not only gives global developers a platform for showcasing and discussing AI models; used together with a Coral Dev Board and its Edge TPU, it accelerates ML inference, lets you preview a running model in real time from a phone, and helps AI grow from prototype to product.
Developers can publish their trained AI models, and can also subscribe to and download models that interest them, retraining and extending them to realize the idea-prototype-product pipeline. Model Play also comes preloaded with commonly used AI models such as MobileNetV1 and InceptionV2, and supports submitting and releasing retrainable models so that users can optimize and fine-tune them on their own business data.
Just as Google called on developers at this year's I/O conference to contribute to the developer community, the Model Play team is issuing a call for AI models to developers around the world, soliciting TensorFlow-based deep learning models that can run on the Google Coral Dev Board, encouraging tens of thousands of AI developers to take part and share their ideas.

April update from Coral

2020-06-19 10:25:12 | Google AIY
Beta API for model pipelining with multiple Edge TPUs
Posted by the Coral Team
We've just released an updated Edge TPU Compiler and a new C++ API to enable pipelining a single model across multiple Edge TPUs. This can improve throughput for high-speed applications and can reduce total latency for large models that cannot fit into the cache of a single Edge TPU. To use this API, you need to recompile your model to create separate .tflite files for each segment that runs on a different Edge TPU.
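For example, the recompilation step might look like the sketch below; this assumes the edgetpu_compiler from this release and its --num_segments option, and the model and output file names are illustrative.

```python
# Sketch of recompiling a model into four segments for pipelining across
# four Edge TPUs, assuming edgetpu_compiler 2.1+ and its --num_segments flag.
import subprocess

subprocess.run(
    ['edgetpu_compiler', '--num_segments=4', 'inception_v3_quant.tflite'],
    check=True)
# Expected result: one .tflite per segment, along the lines of
# inception_v3_quant_segment_0_of_4_edgetpu.tflite ... _3_of_4_edgetpu.tflite
```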
Here are all the changes included with this release:
The Edge TPU Compiler is now version 2.1.
You can update by running sudo apt-get update && sudo apt-get install edgetpu, or follow the instructions here.
The model pipelining API is available as source in GitHub. (Currently in beta and available in C++ only.) For details, read our guide about how to pipeline a model with multiple Edge TPUs.
New embedding extractor models for EfficientNet, for use with on-device backpropagation.
Minor update to the Edge TPU Python library (now version 2.14) to add a new size parameter for run_inference().
New Colab notebooks to build C++ examples.
Send us any questions or feedback at cynthia@gravitylink.com

Manufacturing more efficiently

2020-06-16 12:26:53 | Google AIY
Local AI can maximize throughput and increase safety in manufacturing processes
Edge AI use cases in industry are wide-ranging, from quality control on manufacturing lines to safety monitoring of human-machine interaction. These applications need fast, low-latency inference without compromising accuracy.
QUALITY CONTROL
Imagine spotting defects the human eye can’t see.
Quality control in manufacturing can be complex, especially where high precision is needed. Component defects can be difficult or impossible for the human eye to see, which makes the error rate on this type of inspection very high.
Missed defects can be costly, and industries are deploying local AI at a rapid pace. Coral can enable visual inspection systems that detect faults with high accuracy in situations where human vision falls short.
PREDICTIVE MAINTENANCE
Imagine extending the operating life of expensive machines.
Downtime of a production line or critical machine can lead to slower production, costly repairs, or even catastrophic failure.
With Coral, equipment manufacturers can incorporate features that monitor and analyze machine behavior and warn of impending failures. That can inform a system of predictive maintenance to avoid expensive downtime.
WORKER SAFETY
Imagine a worksite that can see accidents before they happen.
Many worksite injuries are due to preventable accidents — workers falling, failing to see heavy equipment, or failing to be seen by machinery.
Using Coral-enabled cameras and other local sensors to monitor a job site, operators can give robots and vehicles the ability to operate safely alongside human workers, preventing collisions and making collaborative work with machines a reality.
Incident and avoidance data pooled into predictive models allows site managers to anticipate activities that may prove dangerous and to make process changes.

AAEON Technology Inc. and Gravity Link Technology Co., Ltd. Have Successfully Reached a Cooperation Agreement

2020-06-10 15:30:57 | Google AIY
Shenzhen Gravity Link Technology Co., Ltd. has successfully signed a contract with Taiwan's AAEON Technology, a member of the ASUS Group. This means Gravity Link becomes a long-term supplier to AAEON. Through this cooperation, the two sides hope to write a new chapter of mutually beneficial, win-win cooperation!
As a global distribution partner for Google's AI chipset, Gravity Link has made great progress along the way, launching sales in more than 40 countries and regions worldwide and reaching cooperation agreements with many well-known enterprises. Brand strength is a key enabler and one of its core competitive advantages. The cooperation between Gravity Link and AAEON follows a clear trend: it not only benefits both parties, but also promotes the rapid development of the artificial intelligence industry.
Taiwan's AAEON Technology is a leading manufacturer of advanced industrial and embedded computing platforms. It is committed to innovative projects and provides integrated solutions, hardware, and services to world-class OEM/ODM customers and system integrators. AAEON continues to pursue innovation and excellence, and joined the ASUS Group in 2011. With the ASUS Group's advanced technology and abundant resources, AAEON's leadership in the industry has been strengthened. Meanwhile, Gravity Link is a Chinese incubator operating partner officially authorized by Google in the United States. It has a complete product line and excellent resources, which can meet AAEON's development needs and supply high-quality Coral chips to maximize product value.
Gravitylink has worked hard in the AI industry for many years. With its deep industry experience, sincere approach to cooperation, and forward-looking strategic thinking, Gravity Link has become a trustworthy brand for industry partners. The potential of the AI industry is unlimited and its future promising. Gravity Link has always been committed to helping every enterprise that loves AI. In this open, diverse, and creative era, Gravity Link, true to its original intention, will join hands with more like-minded partners on the journey toward a broader and smarter future!