A few days ago, I was doing some TypeScript work and went looking for the docs of a specific npm package. I generally use DuckDuckGo these days, so I popped in my search and got back some results. These results were a bit odd:
Having made a living off of writing C for a few years, I don’t like it when random-looking byte strings show up in output. We’ve all seen the likes of (Heart|Cloud)Bleed and know that those random byte strings could be anything. Searching further, it’s not just the package I was looking for that has this issue (https://archive.is/gURQv, archived from the original):
Being the responsible engineer that I am, I reached out to DuckDuckGo’s engineers, who confirmed that this is actually coming from the third-party API that they use to interact with Bing, so it’s not a DuckDuckGo issue. So I decided to dig a bit deeper and signed up for an Azure account. Sure enough, the Bing API is spitting back these weird values:
curl "https://api.bing.microsoft.com/v7.0/search?q=site%3Anpmjs.com+%C3%A2%C2%82%C2%AC%C3%83%C2%99&mkt=en-US" -H "Ocp-Apim-Subscription-Key: $SUB_KEY" | jq '.webPages.value[].snippet' | less
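Paging through snippets in less works, but the odd ones are easier to spot if something flags them. Here is a minimal sketch, assuming the response has been saved and parsed as JSON. The webPages.value[].snippet path comes from the jq filter in the command; the url field, the function names, and the 5% threshold are assumptions made for illustration, and the check is only a rough heuristic.

```python
import unicodedata


def suspicious_ratio(text: str) -> float:
    """Fraction of characters that look like binary debris rather than prose."""
    if not text:
        return 0.0
    bad = 0
    for ch in text:
        cat = unicodedata.category(ch)
        # Cc = control characters (this includes the C1 range U+0080..U+009F),
        # Co = private use, Cn = unassigned, Cs = surrogates.
        # U+FFFD is the replacement character left behind by failed decoding.
        if (cat in {"Cc", "Co", "Cn", "Cs"} and ch not in "\t\n\r") or ch == "\ufffd":
            bad += 1
    return bad / len(text)


def flag_snippets(response: dict, threshold: float = 0.05) -> list[tuple[str, float]]:
    """Return (url, ratio) for every result whose snippet exceeds the threshold.

    Assumes each entry in webPages.value carries "snippet" and "url" keys;
    missing keys are treated as empty strings.
    """
    flagged = []
    for item in response.get("webPages", {}).get("value", []):
        ratio = suspicious_ratio(item.get("snippet", ""))
        if ratio > threshold:
            flagged.append((item.get("url", ""), ratio))
    return flagged
```

Saving the curl output to a file and passing the result of json.load on it to flag_snippets would list the results worth a closer look. Garbage that happens to consist of printable characters will slip past this check, so treat an empty list as "nothing obvious", not "nothing there".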
If we look at Bing itself, the snippets on those packages are correct, so this seems to be an issue with the API rather than with indexing. This leads me to a question: is the Bing API leaking memory into searches? I reached out to Microsoft, but because I’m too lazy to investigate whether this is actually sensitive data, they don’t care. The file utility tells me that some of these are chunks of public keys (but file is notoriously wrong about random bytes, so take that with a block of salt), which seems worrying, but not actively exploitable. Maybe folks would like to have a look around? There’s probably a bounty in there somewhere for any interesting discoveries.
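For anyone who does want to look around, one thing worth checking before reading too much into the bytes is how many encoding layers they are wrapped in. This is a minimal sketch, assuming (and it is only an assumption) that the text is UTF-8 that got read as Latin-1 and re-encoded, possibly more than once. peel_utf8_layers is a hypothetical helper, not part of any library. Running it on the search term from the curl command shows that the term itself unwraps to two characters; that says nothing about where the bytes in the snippets come from.

```python
from urllib.parse import unquote


def peel_utf8_layers(text: str, max_layers: int = 3) -> str:
    """Undo repeated "UTF-8 bytes read as Latin-1" mis-decoding where possible.

    Stops as soon as a layer cannot be peeled (the text is not Latin-1
    encodable, or the resulting bytes are not valid UTF-8) or nothing changes.
    """
    for _ in range(max_layers):
        try:
            candidate = text.encode("latin-1").decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break
        if candidate == text:
            break
        text = candidate
    return text


# The percent-encoded search term from the curl command.
term = unquote("%C3%A2%C2%82%C2%AC%C3%83%C2%99")
print(ascii(term))                    # '\xe2\x82\xac\xc3\x99'
print(ascii(peel_utf8_layers(term)))  # '\u20ac\xd9'
```

Plain ASCII passes through unchanged, so the helper is safe to run over every snippet. The max_layers cap is an arbitrary choice to keep the loop bounded.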