Fighting Confirmation Bias Online

By: Anthony Naddeo - (politics)

The problems

  1. People are not exposed to ideas they disagree with in non-hostile environments
  2. People do not know who to trust

One possible part of the solution

This is going to be a high level design review in public. Ever since I clicked on the Milo-Berkeley riot video on YouTube, I’ve been fed an endless supply of partisan click bait. It has become trivial to maintain an enraged state today, and I know that people I don’t agree with are being fed similar diets. I’d like to throw an idea out into the universe that I think could either help counter this or inspire other ideas that can. One thing is for sure: social media and news outlets aren’t going to change the menu as long as competing for viewers’ attention is the only way they have to compete.

The Web of Trust is a powerful concept [1][2]. It mirrors our non-digital behavior in many ways. If you want to be a programmer but you don’t know where to start, you might start by talking to a person that you trust who is a programmer. In that person’s absence, you might talk to a person that you trust who knows a programmer. You would trust that programmer by proxy and value their advice more than that of a complete stranger. The implication is that the person you know wouldn’t try to deceive you, which is a safe assumption more often than not in this scenario.

Using the idea of a trusted network, we can set up a simple system for combating the confirmation bias that we so readily seek and find on social media platforms today. The following is a high level design of such a system.

High level technical design

The goal is to build and maintain a graph of trust that we can navigate to surface opposing view points. The following sequence is the intended use case. This system is code named CounterView, hopefully for obvious reasons.

  1. User comes across an article in one of their streams of social media.
  2. The article is something that this user already agrees with.
  3. Instead of reading it, the user can query CounterView with a link to the article they already agree with.
  4. The article is parsed and classified as reporting on a set of one or more topics.
  5. The article is also placed into a single (possibly political) bucket based on its author, location and other metadata. Buckets include Left, Center-Left, Center, Center-Right, Right. Buckets just represent a spectrum and can be renamed for use outside of politics.
  6. Assuming the article was Center-Left, the web of trust would be traversed to find people who are largely considered to be Center-Right based on their opinion and the opinions of the others in the web of trust.
  7. Of those Center-Right candidates, their work which was classified as having those same sets of topics (with some confidence level) would be recommended to the User instead.
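As a rough illustration of steps 5 to 7, here is a minimal Python sketch. It assumes that the opposing bucket is found by mirroring across the center of the spectrum (so Center maps to itself), and it reduces the web of trust to a flat author-to-bucket lookup. The names Article, opposing_bucket, counter_view and the example.com URLs are hypothetical and exist only for this example.

```python
from dataclasses import dataclass

# Buckets from the list above, ordered left to right.
BUCKETS = ["Left", "Center-Left", "Center", "Center-Right", "Right"]


def opposing_bucket(bucket: str) -> str:
    """Mirror a bucket across the center (assumption: Center maps to itself)."""
    i = BUCKETS.index(bucket)
    return BUCKETS[len(BUCKETS) - 1 - i]


@dataclass(frozen=True)
class Article:
    url: str
    author: str
    topics: frozenset  # topic labels, hand curated or extracted


def counter_view(article, author_bucket, works):
    """Return same-topic articles by authors in the opposing bucket.

    author_bucket: hypothetical mapping of author name -> bucket.
    works: every known Article written by members of the trust graph.
    """
    target = opposing_bucket(author_bucket[article.author])
    return [
        w for w in works
        if author_bucket.get(w.author) == target and w.topics & article.topics
    ]


# Made-up data: A is Center-Left, so only Center-Right work on "trade" qualifies.
author_bucket = {"A": "Center-Left", "C": "Center-Right", "D": "Left"}
works = [
    Article("https://example.com/c-on-trade", "C", frozenset({"trade"})),
    Article("https://example.com/c-on-energy", "C", frozenset({"energy"})),
    Article("https://example.com/d-on-trade", "D", frozenset({"trade"})),
]
query = Article("https://example.com/a-on-trade", "A", frozenset({"trade", "tariffs"}))
recommendations = counter_view(query, author_bucket, works)
```

A real version would replace the flat lookup with a walk over the trust graph and would score topic overlap with a confidence level instead of requiring only one shared topic.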

Below is a high level overview of what a sample request might look like. There is no magic involved. Most things would be hand curated. Part of the process of adding someone to the trust network would be giving them and their work some sort of a reasonable classification on the political (or any other binary-ish) spectrum. That classification serves only to determine what articles are different from each other, rather than the same.

This system depends on solving a small number of discrete problems:

  1. Programmatically gathering all of the work of the individuals in the graph. At least headlines are needed.
  2. Programmatically extracting topics or issues from pieces of media. Being able to tell if two pieces of media (articles, videos, blogs, etc.) are about the same thing.
  3. Manually assigning graph members to discrete buckets based on their opinion and the opinion of other members in the graph. Buckets include Left, Center-Left, Center, Center-Right, Right.
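For the second problem, one simple way to decide whether two pieces of media are about the same thing is to compare their sets of topic labels. The sketch below uses Jaccard similarity as a stand-in for the "confidence level" mentioned earlier. This is an assumption for illustration, not a requirement of the design, and the function names and the 0.5 threshold are hypothetical.

```python
def topic_overlap(a: set, b: set) -> float:
    """Jaccard similarity: shared topics divided by all topics seen."""
    if not a and not b:
        return 0.0  # no topics on either side, so nothing to compare
    return len(a & b) / len(a | b)


def same_subject(a: set, b: set, threshold: float = 0.5) -> bool:
    """Treat two pieces of media as the same subject above a threshold."""
    return topic_overlap(a, b) >= threshold


# Made-up topic labels for two articles.
left_piece = {"immigration", "border", "visa"}
right_piece = {"immigration", "border", "asylum"}
score = topic_overlap(left_piece, right_piece)  # 2 shared out of 4 total
```

How the topic labels get produced (by hand or by some extraction step) is left open here.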

How much any individual user trusts this graph and these classifications then becomes a matter of how much they trust the graph members. To an extent, sub-graphs can be constructed by selecting a few nodes from the larger graph and using them as starting points.
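One way to picture the sub-graph idea is a breadth-first walk outward from a few hand-picked members, stopping after a fixed number of hops. The sketch below assumes the graph is stored as a dictionary of "who trusts whom" lists. The name trusted_subgraph, the max_hops limit and the sample data are hypothetical.

```python
from collections import deque


def trusted_subgraph(trusts: dict, seeds: list, max_hops: int = 2) -> dict:
    """Return {member: hops} for everyone reachable from the seed members."""
    hops = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        person = queue.popleft()
        if hops[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in trusts.get(person, []):
            if other not in hops:
                hops[other] = hops[person] + 1
                queue.append(other)
    return hops


# Made-up graph: B trusts C, C trusts D, D trusts E; X and Y are unrelated.
trusts = {"B": ["C"], "C": ["D"], "D": ["E"], "X": ["Y"]}
reachable = trusted_subgraph(trusts, ["B"])
```

Starting from B with a limit of two hops, the result contains B, C and D. E is three hops away and X and Y are never reached, so a user who only trusts B never sees recommendations that depend on them.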

Growing the graph

The graph starts with a single person. That person has to start by adding people who they trust in the public sphere. In the beginning, the graph will probably be lopsided, leaning heavily in one direction. If the entire graph consists only of a single political bucket then no recommendations can be made. If that is the case, then one of the trusted members has to be asked for references; people who they trust and do not agree with. If a reference can be found and added to the trust web, then they can be used to recruit additional members in a similar fashion.
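A small sketch of the lopsidedness check described above, assuming members are stored in a dictionary of name to bucket. The helper names and the sample members are hypothetical.

```python
from collections import Counter


def bucket_counts(member_bucket: dict) -> Counter:
    """Count how many graph members sit in each bucket."""
    return Counter(member_bucket.values())


def can_recommend(member_bucket: dict) -> bool:
    """A graph with a single bucket has no opposing views to offer."""
    return len(set(member_bucket.values())) >= 2


# Made-up early graph that leans entirely one way.
graph = {"P": "Left", "Q": "Left"}
before = can_recommend(graph)

# P names R, someone they trust but disagree with, as a reference.
graph["R"] = "Center-Right"
after = can_recommend(graph)
counts = bucket_counts(graph)
```

Here before is False and after is True, and the counts show how far the graph still leans, which is a hint about which members to ask for references next.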

Why not focus on machine learning?

A lot of people have expected this to involve more machine learning. It can be a powerful tool but it isn’t without drawbacks. The goal is to expose people to new ideas that they disagree with in a way that they can trust and understand. It can be difficult to explain why someone received a suggestion from a neural net in a way that they can understand and find credible.

An explanation that someone might get in a machine learning powered system might be, “People who often disagree with this article found these articles trustworthy”. Some follow up questions from that person might be, “why did they find these articles trustworthy?”, “what kind of people were they?”, “should I care if many people agree on something?”.

I think a better explanation for an article recommendation would be, “This article was written by A and is published on foo.com, which is considered a far left publication. You trust B who claims that C is a reputable author who is considered center right. Here are some of C’s articles on this topic”. I think this is a lot closer to how we function in our offline lives.
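An explanation like this can be generated directly from the graph, since it is just the chain of trust between the user and the recommended author. The sketch below finds the shortest chain with a breadth-first search and turns it into a sentence. The function names, the graph layout and the exact wording are assumptions made for illustration.

```python
from collections import deque


def trust_path(trusts: dict, start: str, target: str):
    """Shortest chain of trust from start to target, or None if none exists."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if person == target:
            path = []
            while person is not None:  # walk back to the start
                path.append(person)
                person = parents[person]
            return path[::-1]
        for other in trusts.get(person, []):
            if other not in parents:
                parents[other] = person
                queue.append(other)
    return None


def explain(path, bucket: str) -> str:
    """Render a chain such as ["you", "B", "C"] as a readable sentence."""
    if path is None or len(path) < 2:
        return "No chain of trust was found."
    chain = ", who trusts ".join(path[1:])
    return f"You trust {chain}, who is considered {bucket}."


# Made-up graph: the user trusts B, and B vouches for C.
trusts = {"you": ["B"], "B": ["C"]}
message = explain(trust_path(trusts, "you", "C"), "center right")
```

With this data the message reads "You trust B, who trusts C, who is considered center right." Every word in it traces back to an edge that a person added by hand, which is the property a neural net suggestion lacks.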

Another good reason to avoid machine learning for something like this is the maintenance. Machine learning at scale tends to require a higher skill floor. It also tends to require more data and more compute. The design in this post can be understood and implemented by undergraduate computer science students using free, stable, open source software.

That said, machine learning definitely has its place in this system. Manually extracting topics from every article that every person in the graph has ever written is daunting.

Conclusion

This idea is pretty rough but I feel that it is important to get it out there. Whoever reads this, feel free to iterate on it, give me feedback or just go steal it and make it work. Tools like this are going to be important for our long term sanity in our digital age.

Footnotes

  1. https://www.linux.com/learn/pgp-web-trust-core-concepts-behind-trusted-communication 

  2. https://wiki.debian.org/Keysigning