iXBT Labs - Computer Hardware in Detail

XviD MPEG4 Codec and Pentium 4: The Wrong Way to Optimize Software




The performance tests we ran some time ago on Intel's and AMD's desktop processors, in which XviD video encoding was remarkably slow on the Pentium 4, left an unpleasant impression. The same code can certainly run differently on different architectures, but the results shouldn't differ by a factor of two! The Athlon XP 3200+ and the Pentium 4 3.2 GHz have comparable potential, so when one turns out to be twice as fast as the other, it looks abnormal. Correctly written software should not produce such a gap, so something must be wrong with the software itself.

We decided to investigate. The first hypothesis to check was that the problem lay in the source file being encoded. Video compression is a complicated process, and a particular combination of source video and processor architecture could conceivably cause such a failure. This was easy to check: we took our standard test suite, simply replaced TEST.MPG with TEST.VOB in the VirtualDubMod script, and compressed this other source file with the same XviD version and the same codec parameters:

As you can see, the situation doesn't change much: with the standard MPG file (MPEG2) the Pentium 4 3.2 GHz was half as fast as the Athlon XP 3200+, and now their scores differ by a factor of 2.1. That small difference is not important. What matters is that the problem is not in the source file.

The next hypothesis was that the problem lay in the particular XviD version. This was also easy to verify: within the same standard test procedure we swapped the codec build, set the same parameters (where possible) and took measurements. We used two builds: an earlier one from the same builder (Koepi build 24.06.2003) and the latest one from Nic (Nic's build 16.07.2003). The latter was compiled with the Intel C Compiler 7.1, which should benefit performance on Intel processors. Incidentally, we could not find out whether Koepi and Nic are actually XviD developers. Koepi said that he had done some work for XviD, but his name is not mentioned in the list of developers. The situation with Nic is even vaguer. But since they are key suppliers of binaries (especially Koepi, whose builds often turn up in codec packs), we decided that they could fairly represent XviD in our review. Let's take a look at the scores.

Nic's build is indeed friendlier to the Pentium 4. Koepi's builds, by contrast, are much harder on the Intel CPU: while the Athlon XP gets faster with the newer build, the Pentium 4 stays at the same level. Either way, we got the impression that something is wrong with Koepi's builds, and that the problem is not only in the code.

That is why we decided to look into one more possibility: what if the optimization parameters are selected incorrectly? You might remember that Microsoft Windows Media Encoder tried to detect SSE support by checking the CPU maker (it assumed that a non-Intel CPU could not support SSE). Thankfully, in addition to automatic detection, the XviD settings let you tick the SIMD instruction sets manually. The codec supports MMX, 3DNow!, 3DNow! 2 (the developer probably means the Extended 3DNow! of K7 based processors), SSE, Integer SSE (?) and SSE2. Since the Athlon XP fully supports both SSE and 3DNow! 2, we merged the 3DNow! and 3DNow! 2 options under the name 3DNow!, and Integer SSE and SSE under the name SSE. First of all, let's test the latest XviD build from Koepi (1.0 beta 2, 05.12.2003) with different manually selected optimizations on the Athlon XP 3200+.
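The difference between guessing SIMD support from the CPU maker and reading the actual feature flags can be sketched in a few lines. This is only an illustration, not the code of Windows Media Encoder or XviD: the function names are hypothetical, and the register value for the Athlon XP is constructed by hand rather than read from a real CPU. The bit positions are those of the standard CPUID feature flags (leaf 1, EDX register).

```python
# Standard CPUID feature bits (leaf 1, EDX register).
MMX_BIT = 1 << 23
SSE_BIT = 1 << 25
SSE2_BIT = 1 << 26


def has_sse_by_vendor(vendor: str) -> bool:
    """Flawed approach: assume only Intel CPUs can support SSE."""
    return vendor == "GenuineIntel"


def has_sse_by_feature_bit(edx: int) -> bool:
    """Robust approach: test the SSE feature bit itself."""
    return bool(edx & SSE_BIT)


def detect_simd(edx: int) -> dict:
    """Report which of MMX, SSE and SSE2 the feature register advertises."""
    return {
        "MMX": bool(edx & MMX_BIT),
        "SSE": bool(edx & SSE_BIT),
        "SSE2": bool(edx & SSE2_BIT),
    }


# Hand-built (hypothetical) register value for an Athlon XP class CPU:
# MMX and SSE are present, SSE2 is not.
athlon_xp_vendor = "AuthenticAMD"
athlon_xp_edx = MMX_BIT | SSE_BIT

print(has_sse_by_vendor(athlon_xp_vendor))    # vendor check misses SSE
print(has_sse_by_feature_bit(athlon_xp_edx))  # feature bit finds it
print(detect_simd(athlon_xp_edx))
```

The vendor check rejects the Athlon XP even though its feature bit says SSE is available, which is exactly the kind of mistake described above.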

Clearly the MMX optimization has the greatest effect; it is the determining factor for the Athlon XP. The 3DNow! optimization has a much weaker effect than SSE, and even the two together fall short of MMX. What kind of optimization is this if the outdated MMX beats 3DNow! + SSE? It's strange... It shows that there is a problem with the optimized code. But that is only the beginning of the problem.

MMX does have an effect, but only when all other optimizations are disabled! Enable MMX together with, say, SSE, and the performance of the Pentium 4 based system drops considerably: by a factor of 1.5! We can also see that the codec has no noticeable optimization for SSE2: compare the columns named "No Optimization" and "SSE2", as well as "MMX" and "MMX+SSE2", "SSE" and "SSE+SSE2". The SSE optimization, on the other hand, is good: on the Athlon XP it is merely weaker than MMX, and on the Pentium 4 it stands out from the rest. On this processor, at least, SSE is the only SIMD set that brings any benefit. In short, with automatically configured parameters the performance of the Pentium 4 with the XviD 1.0 beta 2 (Koepi) codec is artificially reduced, by as much as 1.6 times!
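The column-by-column comparison described above amounts to choosing the flag combination by measurement rather than by enabling everything the CPU reports. A minimal sketch of that idea follows. The helper names are hypothetical, and the speeds are round placeholder numbers chosen only to make the example run; they are not the measurements from our tests.

```python
from typing import Dict, FrozenSet, Tuple

Flags = FrozenSet[str]


def pick_best(results: Dict[Flags, float]) -> Tuple[Flags, float]:
    """Return the flag combination with the highest measured speed."""
    best = max(results, key=lambda flags: results[flags])
    return best, results[best]


def slowdown(results: Dict[Flags, float], chosen: Flags) -> float:
    """How many times slower the chosen combination is than the best one."""
    _, best_speed = pick_best(results)
    return best_speed / results[chosen]


# Placeholder speeds (frames per second), NOT real benchmark data.
placeholder_results: Dict[Flags, float] = {
    frozenset(): 10.0,
    frozenset({"MMX"}): 11.0,
    frozenset({"SSE"}): 20.0,
    frozenset({"MMX", "SSE"}): 16.0,
}

best_flags, best_speed = pick_best(placeholder_results)
print(sorted(best_flags), best_speed)

# Cost of simply enabling every available set instead of measuring.
print(slowdown(placeholder_results, frozenset({"MMX", "SSE"})))
```

With real measurements in place of the placeholders, the same two functions show both the best manual setting and the price paid for the automatic one.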

To put this in perspective: even if we suppose that Pentium 4 performance grows in proportion to its clock speed, the clock would have to be raised to about 5 GHz for the automatic settings to match the results obtained with correctly selected optimization parameters! Now let's take a look at the same data from a different standpoint.
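The 5 GHz figure follows directly from the numbers above: a 3.2 GHz processor that loses a factor of 1.6 would need 1.6 times the clock to compensate, assuming performance scales linearly with frequency. The function name below is just for illustration.

```python
def equivalent_clock_ghz(clock_ghz: float, slowdown_factor: float) -> float:
    """Clock needed to offset a slowdown, assuming linear scaling with frequency."""
    return clock_ghz * slowdown_factor


# Pentium 4 at 3.2 GHz, slowed down 1.6 times by the automatic settings.
print(round(equivalent_clock_ghz(3.2, 1.6), 2))  # 5.12
```

Linear scaling is an optimistic assumption; in practice performance usually grows more slowly than the clock, so the real figure would be even higher.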

So, the subject in question is the SIMD optimization in the XviD 1.0 beta 2 Koepi build. The greatest effect on the Athlon XP comes from the oldest SIMD set, MMX (which at the same time kills the Pentium 4). SSE is well implemented: the gain is almost twofold on the Athlon XP compared to running with no optimizations, and 1.9 times on the Pentium 4. SSE2 is also announced, but we found no trace of it. 3DNow! merely lets the Athlon XP catch up with the Pentium 4 running without any optimizations! Can this be just a peculiarity of this particular XviD version? To find out, we again turned to Koepi build 24.06.2003 and Nic's build 16.07.2003. Let's see whether disabling the MMX optimization in other codec versions produces the same magical effect on Pentium 4 based systems.

Well, such a crippled version is probably the exception. The MMX optimization that hurts the Pentium 4 so badly can be considered an exclusive feature of XviD 1.0 beta 2 from Koepi. However, the two bottom lines (Koepi build 24.06.2003) indicate that this trend started half a year earlier. Nic's build 16.07.2003, meanwhile, shows normal results: with MMX disabled performance drops, with MMX enabled it improves.

Summary

  • An additional instruction set does not guarantee any performance gain. In skillful hands :) one and the same optimization method can become an excellent tool for achieving the opposite effect. Besides, this can happen only on certain processor architectures (which makes the problem more difficult to trace).
  • The automatic optimization parameters set by XviD 1.0 beta 2 (Koepi build) have a devastating effect on Pentium 4 performance. I strongly recommend that all users with systems based on this CPU set the optimization parameters manually and disable MMX.
  • At the same time, the latest XviD build from Koepi delivers the best compression speed even on Pentium 4 based systems, but only with manually selected parameters!
  • The terrible defeat of the Pentium 4 in XviD compression revealed in the recent tests is actually not that awful: it amounts to 32% rather than 100%. That's not small, but not fatal either.
  • In the fight between megahertz and the clumsy hands of programmers (or builders), the programmers always win. End users lose, because they can't afford to spend a couple of days hunting for bugs across multiple beta versions.

Stanislav Garmatyuk (nawhi@ixbt.com)
 

Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.