Wednesday, July 2, 2014

Vulnerabilities disclosure - how it's supposed to work

Note : there is a follow-up to this story

In the lifetime of LZ4, I've received tons of feedback, feature requests, warm thanks, and bug disclosures, often with a bugfix freely offered.

This is the essence of open source : through widespread usage and reports, the little piece of software gets exposed to more and more situations, becoming stronger, and reaching over time a level of perfection that most closed-source implementations can only dream of, limited as they are to internal testing alone. The strength of the formula relies entirely on such feedback, since it is impossible for a single organization to guess and test all possible situations.
It is an enlightening experience, and I encourage every developer to follow a similar journey. It will open your eyes and broaden your perspective into a much larger, interconnected world.

Among these issues, there were more than a few security bugfixes. I can't say how grateful I am to people for their welcome contributions in this critical area. Disclosures were received by email, or through the LZ4 issue board. Typical contributors were modest professional developers, offering their help for free on top of a larger and larger edifice, and receiving a quick "thanks" notice in the version log, an honor they were not even requesting. It is this modesty and positive, constructive mindset which really touched me and drove me forward.

Such early security fixes deserve high praise today, far more than a recent "integer overflow hype" created to attract inflated media coverage. It's a sad state of affairs, but much more important issues got fixed far more politely and securely, out of a sincere desire to improve the situation. Over time, such security bugs became less and less common, and almost disappeared, basically proving the strength of the open source formula : the implementation becomes stronger after each correction.

When an implementation becomes widespread, the rules of security disclosure gradually evolve, because the number of systems and users potentially exposed becomes too large to gamble with. A good illustration of a proper reporting process is detailed on the CERT "Computer Emergency Response Team" website. The process is both transparent and clean. The security auditing firm first contacts the upstream developer team, ensuring the bug is understood and a fix undertaken. It sets a typical disclosure delay of 45 days, so that the upstream organization has an incentive to solve the problem fast enough (without the delay, some organizations would opt to do nothing at all !). After that delay, a public disclosure can be made. Hype is optional, but not forbidden either, as long as users are correctly protected beforehand, which is the whole point.

That's why, in retrospect, I was so angry last time about the auditor's code of conduct. I initially reacted to the overblown vulnerability description, and only realized later that it was first and foremost a question of disclosure policy. With just a little bit of communication, everything would have been fine, as usual.
Unfortunately, it did not happen. The auditor barely left a footnote on the issue board requesting a severity raise on an item (which was accepted), and then simply dropped contact, focusing instead on overselling the security risk for maximum media coverage. In doing so, he never looked back to ensure that a fix was in progress or being deployed to protect users before disclosure.

This behavior is totally atypical of a respectable security firm. In fact, it is in complete contradiction with the core values of security auditing. To willingly sacrifice public safety for some selfish media coverage feels just insane. I kept thinking : "Damn, suppose he'd been right..."

Fortunately, the risk was not as large as advertised. And since then, I had been led to believe it was just a genuine communication lapse, thanks to reassuring words from the auditor himself, who acknowledged the communication problem and swore to improve the situation in the future (incidentally linking his nicer words on Twitter for a positive image). Considering that the previous vulnerability couldn't result in anything serious, and willing to grant the benefit of the doubt, the story was closed with a more neutral-toned statement. I would then expect that, from now on, a normal level of communication would be restored, with future bugs being disclosed "as usual", that is, starting with an issue notification, and a discussion.

Foolish assumption.

In total contradiction with his own logged commitment, donb doubled down earlier today, broadcasting a new vulnerability issue directly to the wild, without a single notification to the upstream developer :

The new vulnerability could be real this time. I have not been able to prove or disprove it myself, but I have no reason to disbelieve it. Some specific hardware and debugging capabilities seem to be required to observe it, though.

Apparently, the risk is valid for ARM platforms (maybe only some specific versions, or under a set of additional platform-specific conditions, the exact scope of which I don't know). I have doubts that it is only hardware-related ; I believe it must be OS-driven instead, or a combination of both.
The key point is the ability of a 32-bit system to allocate memory blocks at high addresses, specifically beyond 0x80000000. This situation never occurred in earlier tests : each 32-bit process was confined by the OS into its own virtual address space, whose maximum size is 2 GB, irrespective of the total amount of RAM available (this was tested on 4 GB machines). With no counter-example at hand, this became an assumption. The assumption was key in the discussions assessing the severity level, and remained undisputed up to now. Today's new information is that the situation can in fact be different for some combinations of OS and hardware, the precise list of which is not clearly established.
Should you own a configuration able to generate such a condition, you're very welcome to test the proposed fix on it. The quality of the final fix for this use case will depend on such tests.
The issue tracker is at this address :
A first quickfix is proposed there.
[Edit] The vulnerability's existence can now be tested, using the new fuzzer test available at

In normal circumstances, a vulnerability disclosure is welcomed. For the open source movement, it translates into better and safer code. That's a view I totally embrace.
But obviously, everything depends on the way the vulnerability is disclosed. Even a simple mail or a note on the issue board is a good first step. At the end of the day, the objective is to get the issue fixed and deployed first, before any user gets exposed, to reduce the window of opportunity for malware spreaders. It's just plain common sense.

This latest disclosure shares none of these goodwill elements. By choosing direct wide broadcasting without ever notifying the upstream developer of his finding, the security auditor did the exact opposite of his social mission, ensuring maximum exposure danger for all users and systems. Of course, this choice will earn him a "hacker gangsta" (in his own words) reputation within his professional circle, which he believes is good for him. But that's questionable : can a paying company entrust its critical security vulnerabilities to a self-made security auditor with borderline business practices like these ?

As far as we are concerned, the goal of the game is to get a safer implementation of LZ4 available to the general public, trying nonetheless to keep the window of opportunity as small as possible. In the longer run, the episode will serve as another reinforcing stone, providing a security benefit to the open-source edifice. But in the short term, we suffer exposure.

The new status is as follows :
  • The vulnerability affects only 32-bit applications (64-bit applications are safe)
  • The new vulnerability affects systems allocating memory beyond address 0x80000000 (others were already safe)
This last point is very difficult to verify. It seems Windows systems are safe, for example, but that still leaves a lot of other systems to check. The new fuzzer tool is now designed to test for the existence of this vulnerability, and to check the efficiency of the latest fix against this new exploit scenario.

You can get the new fuzzer tool and the proposed fix at :
The fix seems to provide good results so far ; don't hesitate to test it on your target system, should it match the above conditions.

[Edit] : the fix is good to go to master, hello r119 !
[Edit 2] : since the second condition is relatively difficult to assess, the fix is recommended for all 32-bit applications.

[Edit 3] : After further analysis, it seems the new overflow issue remains relatively difficult to exploit usefully. It has opened new fronts, but still requires some favorable conditions outside of attacker control, such as an allocation at the very end of the available memory address range. Relatively large data blocks remain necessary for a correspondingly good chance of success. Previously published conditions still apply to design an interesting attack scenario. With most LZ4 programs using small blocks, overflow risk is a rarity, if not provably impossible depending on allocation rules.
Still, with a fix available, updating your 32-bit applications to r119+ remains a recommended move.

[Edit 4] : End of the Linux kernel risk assessment. This potential overflow has no useful attack scenario for now. It is nonetheless being fixed, to protect future applications. Given the current list of applications using LZ4 within the kernel, the only remaining attack scenario is a boot image modification. And when such a scenario is possible, you've got a lot more to worry about : a non-guaranteed potential overflow under cooperative conditions pales in comparison to a direct modification of the boot image, typically inserting some worm code.


[Edit] : Sigh, I can only wish such a thing never happens to you. An organized internet straw-man campaign has been launched (guess by whom), inventing words I never said and even strongly disagree with. I feel compelled to set some records straight for the public :
OK, so where is the problem ?
  • Broadcasting a vulnerability to the wild without providing even a single notification to the upstream organization does not come close to the definition of ethical disclosure, by any possible metric
  • Inflating severity levels and spreading meritless fear among the public to harvest some free advertising is a despicable scare-mongering practice
The debate over Responsible Disclosure is not new. In fact, it is gaining strength, precisely because software is becoming ubiquitous. Long gone are the days when a vulnerability would mostly put at risk some isolated computers primarily used to play games. Computing is now interconnected, and the backbone of our most critical services. With the Internet of Things, it's going to be present in everyday devices, including medical equipment, surveillance systems, smart-grid probes, etc.

In Responsible Disclosure, there is Disclosure, which is a good thing. There is also Responsible. For CERT, it translates into calm vulnerability classification, a notification, and a fix delay. There is a huge difference between a notification on a public issue board, which nonetheless remains public, and a communication campaign designed to make a vulnerability known to a maximum number of people before a fix gets a chance to be produced and deployed.

In Europe, the law has chosen its side, ruling in simple terms that providing a manual to launch a cyber attack is about as good as providing plans for a bomb. Of course, only the most blatant cases have had to face a judge, resulting in a few convictions, mostly when the specific charge of "willful harm" was on the table. Currently, justice gets involved when an offender explicitly targets a plaintiff. I believe that someday, it will simply no longer be acceptable to consider public safety a dispensable collateral victim, freely exposed in the line of fire of an advertising exercise or a petty personal revenge.