Configurable media processor: wireless multimedia solutions

Wireless multimedia depends on complex video software and server technology, and generating streaming video and audio depends on equally complex processing. One product that stands to benefit from wireless multimedia technology is the personal digital assistant (PDA). However, the microprocessors used in different PDAs vary widely in performance: some support low-frame-rate video streams and handle low-resolution processing in software, while others cannot support any kind of video stream at all. A PDA system that supports high-quality two-way video communication requires substantial computing power.


Figure 1 The evaluation setup for the MediaWorks configurable processor system, implemented in an Altera APEX 20K1500 FPGA.

One way to improve performance is to use common small-form-factor interface standards such as PCMCIA and CompactFlash (CF) cards, which the latest PDAs support. Both advanced video and wireless processing systems can be built on the PCMCIA and CF card standards. The biggest obstacle to delivering a good streaming-video product for PDAs is that these interfaces cannot meet the high data-bandwidth requirements of high-resolution video; video compression can partially or completely solve that problem. However, streaming-video codecs such as MPEG-4 were developed for systems with relatively unconstrained bandwidth, such as PCs running at several GHz, and they cannot meet the quality, cost, power, and performance requirements of the many wireless multimedia devices on the market. Developing a codec accelerator dedicated to streaming video has therefore become a key issue.
This article introduces several ways to achieve high-quality streaming video on a PDA. MediaWorks built a configurable media processor, refined a design methodology based on programmable logic, and developed a complete, optimized solution. A configurable media processor also enables a hardware/software co-design process, which is critical to engineering productivity and speeds the development of video codecs at different performance points.
PDA devices have screens for viewing graphics, and speakers and microphones for playing back and capturing audio, but they generally do not integrate a camera for video capture. For cost reasons they use the most economical processors available, and these processors usually cannot support bidirectional streaming video on their own; additional processing and compression/decompression hardware and software are required. This article addresses the problem of adding MPEG-4 video capture, transmission, and playback to existing PDA devices.
The first step in the design was to add video-capture capability: a PCMCIA-based VGA-resolution camera with a sensor that works at a sufficient frame rate and image size. Although most PDAs cannot display a full VGA image, they can transmit it over the Internet to PC users, who can then view the entire image. Development with the PCMCIA interface standard is straightforward, and the CompactFlash specification can serve as an alternative. The basic architecture transmits all image data to the PDA over the PCMCIA bus, encoding the outgoing image and decoding the incoming image in software.
The initial result was a frame rate of 7-12 frames per second for QCIF (176×144) images (the image displayed to the local user and the image being transmitted). These results proved the feasibility of the concept, but to satisfy PC viewers the image needed to be larger and the frame rate higher for smoother viewing. The next-generation design required architectural changes to solve these problems.
To add video functions to the PDA, a PCMCIA/CF-based VGA camera must be provided. To raise the system's video performance, however, both the PCMCIA bus bottleneck and the PDA's limited computing power had to be resolved. The design achieves this by placing the video encoder on the camera side of the PCMCIA interface. Depending on the image sequence, encoded video needs less than one-tenth of the data rate of the uncompressed stream, so encoding on the camera side of the bus allows larger video images, and more video frames per second, to be transmitted to the PDA.
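A back-of-the-envelope calculation makes the bus bottleneck concrete. The pixel depth and frame rate below are illustrative assumptions, not figures from the article; the point is only that raw VGA video far exceeds what a PCMCIA card interface comfortably sustains, while the article's "less than one-tenth" compressed rate fits easily.

```c
#include <assert.h>

/* Bytes per second for an uncompressed video stream of the given
   width, height, pixel depth (bits), and frame rate. */
static double stream_bytes_per_sec(int w, int h, int bits_per_pixel, int fps)
{
    return (double)w * h * bits_per_pixel / 8.0 * fps;
}

/* Raw VGA at 16 bpp, 24 fps (assumed values): ~14.7 MB/s.
   At the <1/10 rate cited for encoded video: ~1.5 MB/s, which a
   PCMCIA/CF interface can carry with room to spare. */
```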
Encoding includes some decoding capability, because each encoded frame must be reconstructed for prediction, so video encoding demands more computation than decoding. Configurable processing technology was therefore used to develop an encoder for the VGA camera. This new video architecture also suits next-generation products that add wireless capability. Video decoding remains in software on the PDA.
With this new structure, you can get:
* CIF resolution images at 30 frames per second
* VGA images at more than 20 frames per second

The conventional design method is to find the processor that best fits the task; with a configurable processor such as Altera's Nios or Tensilica's Xtensa, the processor can instead be customized for the task. Design with a configurable processor proceeds in steps. First, evaluate the initial software and hardware configuration on the APEX 20KE FPGA development system or an instruction-set simulator, and identify the performance bottleneck by analyzing the results. Next, propose a way to remove the biggest bottleneck in hardware, software, or a combination of the two, and implement it (custom instructions, processor-configuration changes, coprocessors, multiprocessor systems, and so on). Then re-evaluate; the results should confirm the performance improvement. Evaluate, propose a solution, verify it through further evaluation, and repeat until the hardware/software design meets the performance requirements, as shown in Figure 1. Some bottlenecks may remain unavoidable, but the process converges toward an optimum.
The configurable processor solution must meet the following conditions:
* The processor has a parameterized instruction set.
* The processor has changeable parts such as buffer size.
* The processor has an external or specific coprocessor.
* The processor can run under multi-processor conditions.
* A combination of the above conditions.
With a parameterized-instruction processor, developers evaluate the code on a basic processor configuration to find bottlenecks. A common bottleneck is insufficient cache. With a configurable processor, developers can go back and reconfigure the processor, providing larger instruction or data caches, or raising cache associativity from two-way to three- or four-way. Another kind of bottleneck is a data path too narrow to encode one, or many, pixel blocks efficiently. A configurable processor's data path can be widened so that an entire row of pixels is processed at once, saving processor cycles, and specific instructions can be created to exploit the wider path. In MPEG-4, for example, the sum-of-absolute-differences (SAD) computation can be customized so that 16 independent 8-bit pixel operations are replaced by a single instruction that operates simultaneously on all 16 pixel values packed into a 128-bit word.
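As a reference point for what such a custom instruction replaces, here is the scalar SAD kernel over one 16-pixel row, a minimal sketch of the standard motion-estimation metric rather than MediaWorks' actual code. A 128-bit SAD instruction would produce the same result in a single operation instead of sixteen subtract/absolute/accumulate steps.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar sum of absolute differences over one 16-pixel row, as used
   in MPEG-4 block matching.  Each iteration is one subtract, one
   absolute value, and one add; a custom 128-bit SAD instruction
   collapses all 16 iterations into a single operation. */
static unsigned sad_row16(const uint8_t *a, const uint8_t *b)
{
    unsigned sad = 0;
    for (int i = 0; i < 16; i++)
        sad += (unsigned)abs((int)a[i] - (int)b[i]);
    return sad;
}
```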
Instructions can also be customized for the discrete cosine transform (DCT), but a dedicated coprocessor may serve better: a pipelined DCT coprocessor can take over this work. A software DCT can easily consume 15% to 20% of the processor's cycles; if the processor lacks the required headroom, the DCT software can be replaced by hardware running at 60 to 80 MHz.
A multiprocessor design is similar to using dedicated coprocessors. Video encoding and decoding have a serial character, because each frame must pass through stages such as transformation and quantization in order. The DCT or inverse DCT (IDCT) processes each frame sequentially, so a frame can be handed from one processor to the next, with each processor performing one specific function; the overall frame rate is then the frame rate of the slowest processor. With this solution, the initial filling of the processor pipeline adds latency, and the original encoding/decoding software must be restructured for parallel operation.
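The throughput and latency behavior of such a pipeline can be sketched in a few lines. This toy model, with illustrative stage times I have assumed rather than measured figures from the article, captures the two properties just described: steady-state frame rate is set by the slowest stage, and startup latency is the sum of all stages.

```c
#include <assert.h>
#include <math.h>

/* Toy model of a frame pipeline in which each processor owns one
   stage (e.g. motion estimation, DCT, quantization).  Returns the
   steady-state frame rate; optionally reports the startup latency,
   i.e. the time for the first frame to traverse every stage. */
static double pipeline_fps(const double stage_ms[], int n, double *latency_ms)
{
    double worst = 0.0, total = 0.0;
    for (int i = 0; i < n; i++) {
        if (stage_ms[i] > worst)
            worst = stage_ms[i];
        total += stage_ms[i];
    }
    if (latency_ms)
        *latency_ms = total;
    return 1000.0 / worst; /* slowest stage sets the frame rate */
}
```

Balancing the stages, so no single processor's time dominates, is what makes the restructuring of the encode/decode software worthwhile.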
Preliminary results for this design show a significant improvement in processing cycles, but further optimization is needed (a distributed-arithmetic DCT, a restructured architecture, reduced intermediate storage, and so on).
The current results are summarized in Table 1 to Table 4.
Once the final design is settled, it can be migrated to an ASIC, or converted from the FPGA design to an Altera HardCopy device, to reduce cost.
This article has briefly discussed how to retrofit existing PDAs with multimedia functions using a configurable media processor. The key is hardware/software co-design: when developing solutions to eliminate performance bottlenecks, designers must weigh the possible hardware and software approaches against each other.
