Friday, May 16, 2014

Manufacturing and Scaling Big Data

NOTE: This article appeared first in the May 14, 2014 IMTS Insider
By: Dave Edstrom
At [MC]2 Conference 2014, I had the privilege of speaking on the topic of Manufacturing and Scaling Big Data. In this talk, I spoke of multiple laws that together provide context for the topic of my presentation and this article. The laws I spoke of, and will talk briefly about here, are O’Dell’s Law, Groundwater’s Law, Moore’s Law, and Metcalfe’s Law. And of course you cannot write an article about laws without trying to coin one yourself, so I include Edstrom’s MTConnect Law. The net result is to use the laws to build a scenario on the future challenges of manufacturing and scaling big data. Some of the information in this article also comes from my book, MTConnect: To Measure Is to Know. This month’s IMTS Insider is longer than most because of the complexity of the topic, and because this is my second-to-last IMTS Insider and I wanted to provide readers with a deeper dive on technology.
O’Dell’s Law comes from Mike O’Dell. Mike is currently a Venture Partner at New Enterprise Associates (NEA) and has a very impressive resume and is an Internet pioneer. O’Dell’s Law is made up of two key components:
  1. Scaling is always the problem.
  2. If you are not afraid, you simply do not understand.
These are beyond brilliant. These are like the E=mc² of computing. When you are designing any type of software system, it is easy to design to run with a small number of people, a small number of devices, or with a small amount of data. Take the same software and tell the developers that the requirements were off by five orders of magnitude. You were slightly off in the number of users. It was not 15 users, it was 1.5 million users. Ask them if they will need to change anything about their design. Simply put, scaling is always the problem.
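To make that jump concrete, here is a minimal sketch in Python. The quadratic cost model (work that grows with the number of pairs of users) is an assumption chosen for illustration, not something measured from a real system.

```python
import math


def orders_of_magnitude(planned: int, actual: int) -> float:
    """How many powers of ten separate the planned load from the actual load."""
    return math.log10(actual / planned)


def pairwise_interactions(n: int) -> int:
    """Distinct pairs among n users (assumed quadratic cost model)."""
    return n * (n - 1) // 2


planned_users = 15
actual_users = 1_500_000

print(orders_of_magnitude(planned_users, actual_users))
print(pairwise_interactions(planned_users))
print(pairwise_interactions(actual_users))
```

A design that touches every pair of users is comfortable at 15 users and hopeless at 1.5 million, which is the kind of surprise O’Dell’s Law warns about.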
The other issue is the unknown. Software has so many moving parts that it takes just one small “gee, I forgot about that” to break everything. It is always interesting to hear folks who are not in software say, “I don’t understand why this no longer works” when a new release comes out or “Why does software take so long to write?” Because to do it right, you have to hire really smart people with lots of experience and give them clear guidelines, the tools to design the software, and the time to get it all done, tested, and out the door. This is a non-trivial process.
A very old engineering saying, and one of my all-time favorites, is “Fast, good or cheap. Pick any two. You can’t have all three.” Fast, good or cheap is the mantra of great consultants. Good consultants answer clients’ questions and great consultants question clients’ answers. That is an old adage as well. This is like the speed of light: it is not just a good idea, it’s the law.
Groundwater’s Law is really made up of multiple laws.
  • /* You are not expected to understand this */
  • Everything you know is wrong
  • How do the little electrons know?
  • Do The Math
The first law, /* You are not expected to understand this */, comes from a comment by Dennis Ritchie in a very complex part of the Unix kernel (the context-switching code in the V6 kernel, aka the Unix operating system). One of my favorite phrases of all time is “Everything You Know Is Wrong.” As Wikipedia points out, Everything You Know Is Wrong is the eighth comedy album by the Firesign Theatre, released in October 1974 on Columbia Records. Along with “Do The Math,” all four laws fit under the umbrella, “stop and think this problem through.”
Moore’s Law is something the pundits on cable news pull out when they have their technology segments of a broadcast. Intel co-founder Dr. Gordon E. Moore wrote an article titled “Cramming More Components onto Integrated Circuits,” which was published in Electronics magazine on April 19, 1965. This article has turned into the moral equivalent of Moses coming down from the mountaintop carrying the Ten Commandments of Electronics. In 1965, Dr. Moore, then at Fairchild Semiconductor (Intel was not founded until 1968), took on the tough task of predicting what would happen in silicon design over the next decade. In this 1965 article, Dr. Moore basically stated that the number of components on a chip would double every year; in 1975 he revised the pace to a doubling every two years. This turned out to be incredibly prescient and accurate. Computer legend and friend of Dr. Moore, Carver Mead at Caltech, is credited with coining the term “Moore’s Law.”
When the Internet was first created, one of the initial discussion points was to answer the question, “If we are connecting computers to speak to each other, how many unique addresses are we going to need?” An address is pretty much exactly what you think of as an address. I can send a physical letter to my godfather Luverne Edstrom in Northfield, MN if I put in his correct address. It might sound obvious, but if there are two Luverne Edstroms in Minnesota, then the address for each must be unique. The same logic applies to the Internet. Instead of a physical address, the Internet uses logical addresses. For example, if you have ever set up a home router, you have seen a typical address: four numbers separated by periods. What does this address mean? Each of the four numbers can have a value of 0 to 255, which takes 8 bits (known as a byte), for a total of 32 bits. This means that there are roughly four billion addresses available. When the Internet was created, that was deemed to be much more than could ever be needed. Keep in mind that there were no desktop computers, notebooks, iPhones, Androids, or Wi-Fi enabled scales in people’s homes in the late 1960s and early 1970s. The idea of four billion computers hooked up to the Internet was considered unimaginable!
Fast-forward to 2013 and we all know how this movie played out. In my home alone, I have 19 different devices connected to the Internet, each with its own IP address. Yes, I might be a little more on the geeky side than most, but having 6 devices in the average home is very reasonable when you realize how many things have to be connected to the Internet to be useful.
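The four-numbers-of-8-bits structure can be checked with Python’s standard `ipaddress` module. This is only a sketch: the address 192.0.2.1 is an assumed example taken from the range reserved for documentation, not an address from this article.

```python
import ipaddress

# Assumed example address from the block reserved for documentation.
addr = ipaddress.IPv4Address("192.0.2.1")

# Each of the four dotted numbers fits in one byte (0 to 255).
octets = [int(part) for part in str(addr).split(".")]

# The same address viewed as a single 32-bit number.
as_integer = int(addr)

# 32 bits gives 2**32 possible addresses.
total_addresses = 2 ** 32

print(octets)
print(f"{as_integer:032b}")
print(f"{total_addresses:,}")
```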
The Internet Engineering Task Force (IETF) is a very forward-thinking organization and they decided almost twenty years ago that the Internet was going to eventually run out of IP addresses. To address this concern, they started working on a new version called IPv6. The difference between IPv4 and IPv6 is tremendous in terms of the number of available addresses. What does an IPv6 address actually mean? The IPv6 address size is 128 bits. The preferred IPv6 address representation is 8 groups of 16 bits, written in hexadecimal and separated by colons. For example, an IPv6 address might look like fe80:0000:69b8:c945:1031:3baf:fe0e:c843
What 128 bits means is that there are roughly 340 undecillion addresses available. The two most popular versions of IP are IPv4 and IPv6. Below are some address specifics of both.
  • Total Number of Internet Protocol (IP) Addresses
    • IPv4 is 4,294,967,296
      • That’s roughly 4 billion.
      • That’s 32 bits.
    • IPv6 is 340,282,366,920,938,463,463,374,607,431,768,211,456
      • That’s roughly 340 undecillion.
      • That’s 128 bits.
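The figures in this list follow directly from the bit widths. The short sketch below uses Python’s standard `ipaddress` module and the example IPv6 address from the text; the variable names are my own.

```python
import ipaddress

# The example IPv6 address used in the text.
example = "fe80:0000:69b8:c945:1031:3baf:fe0e:c843"
addr = ipaddress.IPv6Address(example)

# The fully written-out form has 8 groups of 4 hex digits (16 bits each).
groups = addr.exploded.split(":")

ipv4_total = 2 ** 32    # 32-bit addresses
ipv6_total = 2 ** 128   # 128-bit addresses

print(len(groups), "groups of", len(groups[0]) * 4, "bits")
print(f"IPv4: {ipv4_total:,}")
print(f"IPv6: {ipv6_total:,}")
print(f"IPv6 addresses per IPv4 address: {ipv6_total // ipv4_total:,}")
```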
Let’s put IPv6 into proper perspective because I am sure that someone is thinking, “But Dave, 340 undecillion does not sound like a lot, will we run out of IPv6 addresses?” In the context of “never say never” when it comes to technology, I will give you one data point that should help you sleep at night if 340 undecillion does not sound like a big enough number for you. I was looking at a Cisco graphic that stated if we were to count up every single atom on planet Earth and if we were to start assigning IPv6 addresses to each atom, we would be able to give each and every atom 100 IPv6 addresses. You read that correctly. Every ATOM would have 100 IPv6 addresses. What if we find life on another planet and they want to speak to us using the Internet? What about Interplanetary networking aka InterPlaNet? Well, Vint Cerf and other brilliant individuals have already been working on that for some time as well.
Bob Metcalfe, the inventor of Ethernet, made a statement that has now become known as “Metcalfe’s Law.” Metcalfe’s Law basically states that the value of any network is proportional to the square of the number of users or devices connected to it. If we apply Metcalfe’s Law to manufacturing, we would modify it slightly to state: The value of any manufacturing shop floor’s network is the number of pieces of manufacturing equipment that can speak MTConnect squared. Why MTConnect squared and not just the number of pieces of manufacturing equipment squared? Because it is MTConnect that makes these pieces of manufacturing equipment all able to speak the language of the Internet, which is XML. XML is an abbreviation for eXtensible Markup Language and it is the default Internet language today. Speaking the language of the Internet makes it extremely easy for software applications to talk to MTConnect-enabled manufacturing equipment.
Dr. Eric Topol wrote a groundbreaking book titled “The Creative Destruction of Medicine: How the Digital Revolution Will Create Better Health Care” that discusses how remote sensors are going to cut down on visits to your doctor. I was listening to a podcast where Dr. Topol was discussing the use of sensors in the body that would speak to your smartphone, with that data then going to your doctor. These types of sensors might be able to predict a heart attack or stroke before it occurs.
We have now established the perfect storm: a variety of laws, with smaller, faster and cheaper technology, combined with essentially unlimited IP addresses. You don’t have to be Vint Cerf or Bob Kahn, the two people who are appropriately credited with being the fathers of the Internet, to make the bold statement that every device will be connected to the Internet. The first person I heard lay out the business case for IoT was John Gage of Sun Microsystems. John came up with the phrase, “The Network Is The Computer,” but it was also John and Bill Joy of Sun Microsystems whom I first heard say, with the technical specifics to back up the claim, “everything will be connected to the Internet.” That was back in the mid 1980s.
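As a rough illustration of the squared relationship, here is a minimal Python sketch. The function name and the machine counts are hypothetical, and the “value” is a unitless relative figure, not a dollar amount.

```python
def network_value(connected_devices: int) -> int:
    """Relative network value under Metcalfe's Law: devices squared."""
    return connected_devices ** 2


# Hypothetical shop floors with 1, 10 and 100 MTConnect-enabled machines.
for machines in (1, 10, 100):
    print(machines, network_value(machines))
```

Doubling the number of connected machines quadruples the relative value, which is why each additional MTConnect-enabled machine matters more than the one before it.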
On the third-to-last slide, I stated the following:
  • 256 exabytes of data will be created in the year 2025
    • A large percent of that data will be sensor data
  • 300 zettabytes (300,000 exabytes) of total storage around the globe
  • You will be carrying the equivalent of 64 iPhones in your pocket.

To put these numbers in perspective, the amount of printed material at the Library of Congress is 10 TB. An exabyte would be 100,000 Libraries of Congress. Specifically, an exabyte (EB) is 1,152,921,504,606,846,976 bytes, or roughly a billion gigabytes. Stated another way, an EB is 67 million iPhones of data.
Dr. Dean Bartles is well known in the manufacturing industry and sits on a number of boards, including the MTConnect Board. Dr. Bartles likes to discuss the need for a centralized database that can be accessed anywhere and that answers the question, “What’s the best way to make this part?” It’s an easy question to ask, but a very difficult question to answer. There are many issues to address, not the least of which is the protection of intellectual property (IP), but what is clear is that the technology won’t be the limiting factor. Doug Woods, President of AMT – The Association For Manufacturing Technology, also enjoys brainstorming on what more could be done to help manufacturing. It is important to remember that Doug Woods was Chairman of AMT when the decision for AMT to invest in the vision of MTConnect was made back in late 2006, so Doug has a reputation for driving game-changing technology in manufacturing. Preparing for the [MC]2 2014 Conference, I thought, “What would it take to build the D2 (Dean and Doug) MTCorrect App?” The MTCorrect App would answer Dr. Bartles’s question.
The answer to how to make the MTCorrect App will come from manufacturers, computer science and mechanical engineering departments, and research and development departments in both industry and academia. The key point that I brought up was that it will not be just about the data (both structured and unstructured) but it will be about the metadata. Metadata is data about data. For example, the NSA has been in the news regarding phone metadata. The phone data is the actual content of the calls. The metadata is information about the calls such as time of day, who you called, how long you spoke, etc. Manufacturing metadata will be the key to addressing what Dr. Bartles would like to see. It is not realistic to share the actual MTConnect data because that would allow anyone to reverse engineer a part. What could be shared would have to be made both anonymous and meaningful.
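The arithmetic behind those comparisons can be sketched in a few lines of Python. The byte count quoted above is exactly 2 to the 60th power, so this sketch uses binary units throughout. The 16 GB iPhone capacity is my assumption, chosen because it reproduces the 67 million figure; the text does not state it.

```python
EXABYTE = 2 ** 60    # 1,152,921,504,606,846,976 bytes, as quoted in the text
TERABYTE = 2 ** 40
GIGABYTE = 2 ** 30

library_of_congress = 10 * TERABYTE  # printed material, per the text
iphone = 16 * GIGABYTE               # assumed capacity, not stated in the text

print(f"{EXABYTE:,} bytes")
print(f"{EXABYTE // GIGABYTE:,} gigabytes")
print(f"{EXABYTE / library_of_congress:,.0f} Libraries of Congress")
print(f"{EXABYTE // iphone:,} iPhones")
```

With binary units the Library of Congress ratio comes out slightly above the round 100,000 quoted above; with decimal units (10^18 bytes against 10^13 bytes) it is exactly 100,000.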
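The distinction between data and metadata can be shown with the phone call example. The sketch below is purely illustrative: the `PhoneCall` class, its field names, and the fictional phone numbers are all assumptions, and it makes no attempt at real anonymization.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PhoneCall:
    """A hypothetical call record holding both data and metadata."""
    caller: str
    callee: str
    started: datetime
    duration_seconds: int
    audio: bytes  # the data: the actual content of the call


def metadata(call: PhoneCall) -> dict:
    """Return data about the call, leaving the content itself out."""
    return {
        "caller": call.caller,
        "callee": call.callee,
        "time_of_day": call.started.strftime("%H:%M"),
        "duration_seconds": call.duration_seconds,
    }


# Fictional numbers and placeholder audio, for illustration only.
call = PhoneCall("555-0100", "555-0199", datetime(2014, 5, 14, 9, 30), 240, b"...")
print(metadata(call))
```

A manufacturing equivalent would keep the part program and raw machine data private while sharing only descriptive fields, which would still need to be anonymized before they left the shop.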
Below was my summary slide.
  • Step 1: Store lots of structured, unstructured data and meta-data.
  • Step 2: Sift through both to find patterns of correlation and causation.
  • Step 3: Present that data in the right format, at the right time to the right individual.
  • Step 4: Do it faster and better than anyone else.
  • Note: Step 4 is the “kids don’t try this at home” battleground for the next 10 years.
  • What company will be the "MTCorrect for Manufacturing Analytics"?
  • Now go build it Doug and Dean!
It is a very exciting time in manufacturing and big data will be a key component for as far as I can see. It was a real privilege to be at [MC]2 2014 and it was a great conference!
It was a fantastic three and a half years being the President and Chairman of the Board for the MTConnect Institute. I would like to thank the MTConnect Community, MTConnect Technical Advisory Group (MTCTAG) members, MTConnect Board of Directors, and AMT – The Association For Manufacturing Technology for all of their help, support and guidance over the years.
Finally, I want to offer my sincere thanks to Doug Woods, President of AMT. It was Doug who gave me the opportunity to come to AMT as a consultant back in early 2010 to help out in three areas – MTConnect, what would become MTInsight, and future technologies. I hope I helped move the MTConnect ball forward during my term and I look forward to working with all of those involved in MTConnect for many years to come. Doug has been a real mentor and friend, and I cannot thank him enough for such a fantastic opportunity and great memories.
For questions or comments, Dave Edstrom can be found at Virtual Photons Electrons.