The success of Bluetooth and WiFi has led to a significant shift in the way that we think about connectivity. We are finally finding effective ways to replace data cables and that is stimulating the development of new markets.
But as these markets develop, higher speeds and lower power are going to be needed as the current technologies are not fast enough for applications such as video streaming and moving large amounts of data.
A user connecting devices such as digital cameras and iPods wirelessly to a PC or STB (Set Top Box) wants to download audio and video content as quickly as possible, with instantaneous file transfer or real-time streaming as the ideal.
That next generation of wireless connectivity is currently emerging under the umbrella of UltraWideBand (UWB). New coding technologies at higher frequencies are being used to provide 480-Mbit/s links over short distances, allowing more data to be transferred in a shorter period of time, using less power per byte than before.
Figure 1: How a faster link saves power when downloading 1 Gbyte of data. Source: Alereon.
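The reasoning behind the figure can be sketched with simple arithmetic: if the radio draws roughly the same power while active, a faster link finishes sooner and so spends less energy per byte. The sketch below assumes hypothetical link rates and power figures (they are not taken from Alereon's data) and ignores protocol overhead.

```python
# Illustrative only: the power figures and the "slow link" rate are
# assumptions for the sake of the arithmetic, not measured values.

def transfer_time_s(size_bytes: float, rate_bit_s: float) -> float:
    """Time needed to move size_bytes over a link running at rate_bit_s."""
    return size_bytes * 8 / rate_bit_s

def transfer_energy_j(size_bytes: float, rate_bit_s: float,
                      active_power_w: float) -> float:
    """Energy = active radio power x time the radio has to stay on."""
    return active_power_w * transfer_time_s(size_bytes, rate_bit_s)

GBYTE = 1e9  # 1 Gbyte (decimal)

# Hypothetical links: (raw rate in bit/s, assumed active power in watts)
links = {
    "slower link (assumed 20 Mbit/s, 0.5 W)": (20e6, 0.5),
    "UWB link (480 Mbit/s, assumed 0.5 W)": (480e6, 0.5),
}

for name, (rate, power) in links.items():
    t = transfer_time_s(GBYTE, rate)
    e = transfer_energy_j(GBYTE, rate, power)
    print(f"{name}: {t:.1f} s on air, {e:.1f} J")
```

With equal active power, the energy ratio is simply the ratio of the link rates, which is why a higher-speed radio can still come out ahead even if its instantaneous power is somewhat higher.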
But there are significant challenges in the realms of technology, systems, standards and politics that must be addressed before this can really take off.
Technology
There are currently two implementations of UWB using two different technologies, and this has led to the standard stalling in the IEEE 802.15.3a standards committee and that specification being withdrawn.
This development has opened the field for two main rival technologies. One group, backed by Freescale Semiconductor and the UWB Forum, is using a Direct Spread Spectrum (DSS) approach. This allows the power of the radio stage to be scaled back under software control to get the optimum link power, and is being used by companies such as Belkin and Powerlink for high-speed cable replacement.
The other group, the Multiband OFDM Alliance (MBOA), is using OFDM coding similar to WiFi for its implementation. The group has worked closely with the USB Implementers Forum (USB-IF) to develop a Certified WirelessUSB standard based on its specification and is backed by industry giants such as Intel and Texas Instruments, and startups such as Alereon and Staccato Communications. In addition, WiMedia has a worldwide standard with Ecma International, an ISO-recognized standards body (Ecma-368 and Ecma-369).
Standards
The lack of an IEEE standard has slowed the allocation of frequencies for UWB globally, but these are starting to emerge. The US has allocated frequencies in the 3.1 GHz to 10.6 GHz band, while the Ecma standards body, based in Geneva, has published specifications which propose frequencies for UWB that are expected to turn into international ISO standards next year.
This will give chip makers the confidence to manufacture millions of units to bring the cost down to levels that are suitable for the consumer market and for embedding in digital camcorders, video cameras, and any kind of portable electronic system that could use a high-speed data link.
Systems
However, one of the key challenges is actually putting together a system. The power consumption constraints mean that a module is not power or cost effective, and the successful UWB implementations in either technology will be system-on-chip (SoC) designs.
But SoC design is not trivial, particularly to get the optimum power consumption, speed and small size to make it appropriate for the consumer market. One of the key areas is the processor core.
As the leading architecture for mobile platforms, ARM has a long history of providing the most power and area efficient RISC processor cores to power constrained applications. The same power and performance characteristics that make ARM the leading choice in the mobile market are those that most benefit wireless networking designs.
But there is also the SoC design flow to consider. The UWB MAC layer has to be integrated with the processor core and other peripherals such as crypto engines, host interfaces or popular device interfaces. Reducing the time to market and the risk of developing such a system is vitally important, and this is where the experience and comprehensive solutions of a company like ARM show key advantages.
Alternatives such as a hardwired scheduler or non-standard controller may be smaller to implement but take a considerable amount of time to develop and verify. With the uncertainty of new standards and application requirements, a programmable approach is significantly more useful, provided the system is easy to implement and design with.
For example, the WiMedia specification uses a standard UWB MAC combined with a Certified WirelessUSB MAC, and there can be substantial amounts of scheduling and packet set up and tear down needed, so a higher performance, low power processor is a key requirement.
The systems also require high-speed content protection, and while there will be hardware encryption engines, these have to be controlled by a CPU. There may also be higher application-level features such as pre- and post-processing of video and voice that can make use of the core to save valuable die area.
ARM's best offering in this space is the ARM968E-S, a core designed for embedded applications, which means it does not have the memory management unit that is used to support rich operating systems such as Linux and Windows Mobile. In addition to making the core smaller, a non-cached core is easier to integrate and more deterministic in the system, ideal for the hard real time processing requirements of the MAC layer.
Figure 2. Efficient System Design using ARM968E-S
The core measures just 0.59 mm² at 268 MHz on TSMC 0.13 micron Artisan SAGE-HS cell libraries and can be easily dropped into a design, eliminating a significant part of the verification challenge and reducing time to market and risk.
The ARM968E-S includes the 'E' DSP extensions, and incorporates DSP type instructions such as count leading zeros (CLZ), saturating arithmetic, and single-cycle 16-bit multiplies to aid in packet processing.
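To make these operations concrete, here is a small Python model of what count-leading-zeros and a saturating 32-bit add compute. The function names loosely mirror the instruction mnemonics but are only a behavioral sketch, and the "highest pending queue" use is a hypothetical example of packet-processing code, not something described by ARM or Alereon.

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

def clz32(x: int) -> int:
    """Count leading zero bits of a 32-bit value (returns 32 for zero)."""
    x &= 0xFFFFFFFF
    return 32 - x.bit_length()

def qadd32(a: int, b: int) -> int:
    """Saturating signed 32-bit add: clamps at the limits instead of wrapping."""
    return max(INT32_MIN, min(INT32_MAX, a + b))

def highest_pending(bitmap: int) -> int:
    """Hypothetical use: index of the highest set bit in a 32-bit
    'pending queues' bitmap, or -1 if nothing is pending."""
    return 31 - clz32(bitmap)

print(clz32(0x00010000), qadd32(INT32_MAX, 1) == INT32_MAX, highest_pending(0b1010))
```

On the real core each of these is a single instruction, which is what makes them attractive inside tight per-packet loops.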
Rather than cached memory, the ARM968 uses dual-banked tightly coupled data memory (D-TCM) and instruction memory (I-TCM) linked to the core, which can be connected to a direct memory access (DMA) engine to offload the memory transfer management from the core.
This combination of the DMA and TCM is a key advantage for the system developer. The dual-port D-TCM enables arbitration between the processor and DMA port, as the processor and DMA alternately access the D-TCM, interleaved on an even-odd word boundary basis. This unique architectural feature enables the DMA port to move data blocks into the TCM without stalling the processor or consuming processor bandwidth.
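The effect of even-odd interleaving can be illustrated with a toy cycle model. This is only a sketch under stated assumptions: two banks selected by the low bit of the word address, one access per bank per cycle, and the processor always winning a conflict. It does not reproduce the actual ARM968E-S arbitration logic.

```python
def simulate(cpu_addrs, dma_addrs):
    """Toy model of a two-bank TCM (bank = word address % 2).

    Each cycle the CPU issues its next word access (assumed to always win),
    and the DMA issues its next access only if it targets the other bank.
    Returns (total_cycles, dma_wait_cycles).
    """
    cpu_i = dma_i = 0
    cycles = dma_waits = 0
    while cpu_i < len(cpu_addrs) or dma_i < len(dma_addrs):
        cycles += 1
        cpu_bank = None
        if cpu_i < len(cpu_addrs):
            cpu_bank = cpu_addrs[cpu_i] % 2
            cpu_i += 1
        if dma_i < len(dma_addrs):
            if dma_addrs[dma_i] % 2 == cpu_bank:
                dma_waits += 1   # same bank this cycle: DMA waits, CPU is unaffected
            else:
                dma_i += 1       # other bank (or CPU idle): both proceed in parallel
    return cycles, dma_waits

# CPU reads 8 sequential words while the DMA fills 8 sequential words elsewhere.
print(simulate(list(range(8)), list(range(100, 108))))
```

In this model two sequential streams fall into lock-step on opposite banks after at most one wait, so the DMA transfer overlaps almost entirely with processor accesses instead of doubling the cycle count as a single-ported memory would.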
The ability to transfer instruction and data to and from the TCMs without processor involvement gives significantly higher system-level performance for real-time data processing applications.
It can result in a major power saving as the system can be run at slower clock frequencies while maintaining responsiveness and without significantly impacting interrupt latency. At 240 MHz on a 0.13 µm process, the core consumes just 0.14 mW/MHz.
The ARM968 also uses AHB-lite, a simplified version of the AMBA AHB bus for single-master systems. This does not need to support complex bus arbitration, which means the core’s bus interface can be smaller and simpler.
AHB-lite is also a low-latency bus with improved response time, thus reducing memory access wait time. The core has an AHB-lite master port for peripheral and memory accesses, and a second dedicated AHB-lite DMA slave port.
The DMA slave port can be run off a separate clock from the core, allowing the designer to stop the core during DMA instruction or data transfers for further power savings.
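As a quick worked example of what the mW/MHz figure implies, the sketch below multiplies it by the clock frequency. It assumes core power scales linearly with clock at a fixed voltage and process, which is a simplification; the 120 MHz operating point is a hypothetical choice, not a figure from ARM.

```python
MW_PER_MHZ = 0.14  # figure quoted above for a 0.13 micron process

def core_power_mw(clock_mhz: float) -> float:
    """Estimated core power, assuming simple linear scaling with clock."""
    return MW_PER_MHZ * clock_mhz

print(f"{core_power_mw(240):.1f} mW at 240 MHz")   # prints "33.6 mW at 240 MHz"
print(f"{core_power_mw(120):.1f} mW at 120 MHz")   # hypothetical slower clock
```

If DMA offload lets the same workload meet its deadlines at half the clock, this simple model suggests roughly half the core power.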
Another key element of an ARM processor is the infrastructure of tools and software that support development. Virtual prototyping techniques such as ESL (Electronic System Level) simulation allow the SoC to be modeled in software with all the different elements before the design is started.
This allows the SoC designer to explore key design tradeoffs and model the system's performance, both of which are vital in deciding on an optimized bus and memory architecture for the chip.
A key question that comes up with SoC designers working on UWB is how to determine the throughput of the device and ensure memory and bus bottlenecks will not hinder the performance of their system. Largely determined by the memory and bus bandwidth, system level throughput can be examined by simulating the core with memory, PHY and host interfaces.
The PHY shares the bus with external memory, and cycle latency and bus arbitration analysis can help the designer understand the latency issues in the system and how to structure the system for maximum throughput. Using the performance modeling capabilities of ESL tools such as RealView SoC Designer (formerly MaxSim), designers can remove throughput bottlenecks in the early stages of the development.
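Before building a full ESL model, a first-pass budget can show whether a candidate bus is even in the right range. The sketch below is a back-of-envelope check, not a substitute for cycle-accurate simulation; the bus width, clock, efficiency factor and number of bus crossings are all assumed values chosen for illustration.

```python
def bus_utilization(link_rate_bit_s: float, bus_crossings: int,
                    width_bits: int, clock_hz: float,
                    efficiency: float) -> float:
    """Fraction of usable bus bandwidth consumed by the data path.

    bus_crossings: how many times each payload bit travels over the shared
                   bus (e.g. PHY -> memory, then memory -> host = 2).
    efficiency:    assumed fraction of raw bus cycles that carry data after
                   wait states and arbitration.
    """
    demand = link_rate_bit_s * bus_crossings
    supply = width_bits * clock_hz * efficiency
    return demand / supply

# Hypothetical design point: 480 Mbit/s link, 32-bit bus at 120 MHz, 50% efficiency.
for crossings in (2, 3):
    u = bus_utilization(480e6, crossings, 32, 120e6, 0.5)
    verdict = "ok" if u < 0.8 else "likely bottleneck"
    print(f"{crossings} crossings: {u:.0%} utilized ({verdict})")
```

A utilization close to or above 100% flags a configuration that deserves detailed latency and arbitration analysis in the ESL tool, or a change such as a wider bus or fewer copies through shared memory.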
Cycle-accurate and cycle-approximate SystemC models of the standard elements in the design, such as the ARM968, memories, memory controller and bus fabric, are available from ARM and partner companies. Missing custom elements based on in-house IP can be easily included through the creation of high-level C models to simulate behavior, or by translating RTL designs into high-level models with technology such as Carbon Design Systems’ SOC-VSP or TenisonEDA’s VTOC product.
ARM’s RealView ESL solutions are closely coupled with ARM’s RealView Developer Solutions for compilation, debug and instruction set simulation, allowing software programmers and system designers to integrate software development and debug with the ESL environment. Much as software programmers use FPGA prototypes, this platform can be used to boot the OS, and to write and validate critical driver and firmware code before the design reaches silicon.
This provides a base for software development to start much earlier in the design cycle. The models also form the basis of the test suites for hardware and software, significantly reducing the system integration phase of the design, and ideally producing devices that are as compliant to each of the specifications as possible and making UWB devices easily interoperable.
There is also a well-established roadmap for low-power core implementations. The ARM1156 synthesisable core is already well established in the smartphone market, and supports the Thumb-2 code compression technology to help reduce the memory requirements of the system code in next-generation devices. New Cortex cores are also coming along, such as the Thumb-2-based Cortex-M3 and others, that will provide optimal performance and power characteristics for a range of UWB applications.
Power and cost are key factors for UWB chip designs, and this means moving into the SoC arena, often for the first time for many UWB companies. A well-established design flow with the tools to model, simulate, develop and test both hardware and software as concurrently as possible helps dramatically in getting the chip to market at the right time with the best performance and at the right cost point.
Processor cores from ARM are helping many UWB designers achieve exactly this. A good example of how this strategy can succeed is Alereon's wireless USB designs.
Alereon flies high with ARM core
UWB chip designer Alereon in Austin, Texas, has used the ARM968 core for its digital MAC to provide key flexibility to handle the different physical layer protocols and to provide key system capabilities that just aren’t possible with dedicated hardware.
Alereon’s initial focus is on Certified WirelessUSB from the USB-IF using the WiMedia physical layer, aiming to help equipment makers replace the 1bn USB connections out in the market with a wireless version. The company, with 70 people and over 100 patents, has developed both a device chipset and a host front end.
Figure 3: WiMedia includes a wide range of protocols that can be supported on top of a UWB radio platform. Source: WiMedia Alliance.
The device chipset combines the AL4100 SiGe RF front end with the AL4300 combined slave MAC and baseband chip.
The company also sells the physical layer side on its own, again in two chips with the AL4100 RF front end and a high performance AL4200 baseband chip, to host makers who often have their own more complex host MAC implementations.
Figure 4: The Alereon range for Wireless USB clients and hosts. Source: Alereon.
For these, the company evaluated a range of processors, including the ARM7 and other embedded processors, but chose the ARM968 for its low power consumption and small size.
What also appealed to Alereon was the tightly coupled, single-cycle memory with the DMA engine, providing the real time performance.
Alereon has also developed key coding technology that gives the system much more flexibility. Alereon’s CogniPhy technology uses a series of hardware filters and digital signal processing across the PHY and MAC chips to analyze the quality of the radio link that has been set up, and change it if necessary to maintain the best possible link.
This is vital to keep the speed of transfer up and the power consumption down, particularly in ‘dirty’ areas with lots of interference. The data is taken from the filters and an algorithm running on the ARM968 determines whether to reduce the bit rate, change the frequency slots or use smaller packets to provide more error correction.
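The kind of decision logic described here might look something like the following sketch. It is purely illustrative: the inputs, thresholds, ordering of the checks and action names are all assumptions made for this example and do not describe Alereon's actual CogniPhy algorithm.

```python
from dataclasses import dataclass

@dataclass
class LinkStats:
    packet_error_rate: float      # fraction of packets lost, 0.0 to 1.0
    interferer_in_slot: bool      # filters report interference in the current slot

def choose_action(stats: LinkStats,
                  per_ok: float = 0.01,
                  per_bad: float = 0.10) -> str:
    """Pick one link adaptation step (hypothetical thresholds and policy)."""
    if stats.packet_error_rate <= per_ok:
        return "keep"                      # link is healthy, change nothing
    if stats.interferer_in_slot:
        return "change_frequency_slot"     # move away from the interferer first
    if stats.packet_error_rate >= per_bad:
        return "reduce_bit_rate"           # heavy loss: fall back to a slower rate
    return "use_smaller_packets"           # moderate loss: shorten packets

print(choose_action(LinkStats(packet_error_rate=0.05, interferer_in_slot=False)))
```

Keeping a policy like this in firmware rather than hardware is what lets thresholds and priorities be retuned after the silicon ships.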
The core is also used for key regulatory capabilities in the chipset. Because the frequency bands allowed for UWB around the world are different, it can be hard for equipment makers to ensure that their equipment stays within the law. Alereon has a technique for 'sniffing' out other radio technologies that may be around to determine which area the equipment is in and therefore which frequency bands to use. This provides more robust links, but the rules are changing regularly, so a programmable solution is vital.
About the author
Charlene Marini is a market technology specialist at ARM. In this article she looks at the challenges of developing system-on-chip devices for UWB.