Summary of “it’s time to make it fair”

Word embedding, a popular algorithm used to process and analyse large amounts of natural-language data, characterizes European American names as pleasant and African American ones as unpleasant.
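The bias the article describes is typically measured by comparing vector similarities between names and attribute words (the approach behind the Word Embedding Association Test). A minimal sketch of that idea, using made-up 3-dimensional vectors rather than real trained embeddings such as word2vec or GloVe:

```python
# Toy illustration of how word-embedding bias is measured (WEAT-style).
# All vectors here are invented for the example; real studies use
# embeddings trained on large text corpora.
import math

def cosine(u, v):
    # Cosine similarity: higher means the words are "closer" in meaning.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings for two names and two attribute words.
emb = {
    "emily":      (0.9, 0.1, 0.2),   # European American name
    "lakisha":    (0.1, 0.9, 0.2),   # African American name
    "pleasant":   (0.8, 0.2, 0.1),
    "unpleasant": (0.2, 0.8, 0.1),
}

def association(word):
    # Positive: the word sits closer to "pleasant"; negative: closer to "unpleasant".
    return cosine(emb[word], emb["pleasant"]) - cosine(emb[word], emb["unpleasant"])

print(association("emily"), association("lakisha"))
```

With real embeddings, a systematic gap between the two groups' association scores is exactly the kind of bias the article reports.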
In medicine, machine-learning predictors can be particularly vulnerable to biased training sets, because medical data are especially costly to produce and label.
Computer scientists evaluate algorithms on ‘test’ data sets, but usually these are random sub-samples of the original training set and so are likely to contain the same biases.
Every training data set should be accompanied by information on how the data were collected and annotated.
As much as possible, data curators should provide the precise definition of descriptors tied to the data.
Lastly, computer scientists should strive to develop algorithms that are more robust to human biases in the data.
A complementary approach is to use machine learning itself to identify and quantify bias in algorithms and data.
To address these questions and evaluate the broader impact of training data and algorithms, machine-learning researchers must engage with social scientists, and experts in the humanities, gender, medicine, the environment and law.

The original article.


Summary of “Health Insurers Are Vacuuming Up Details About You”

With little public scrutiny, the health insurance industry has joined forces with data brokers to vacuum up personal details about hundreds of millions of Americans, including, odds are, many readers of this story.
Up front, the prime real estate belonged to the big guns in health data: The booths of Optum, IBM Watson Health and LexisNexis stretched toward the ceiling, with flat screen monitors and some comfy seating.
Patient advocates are skeptical health insurers have altruistic designs on people’s personal information.
The Affordable Care Act prohibits insurers from denying people coverage based on pre-existing health conditions or charging sick people more for individual or small group plans.
The Trump administration is promoting short-term health plans, which do allow insurers to deny coverage to sick patients.
At the IBM Watson Health booth, Kevin Ruane, a senior consulting scientist, told me that the company surveys 80,000 Americans a year to assess lifestyle, attitudes and behaviors that could relate to health care.
Actuaries calculate health care risks and help set the price of premiums for insurers.
She points to a study by the analytics company SAS, which worked in 2012 with an unnamed major health insurance company to predict a person’s health care costs using 1,500 data elements, including the investments and types of cars people owned.
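The kind of model the SAS study describes is, at its core, a regression from lifestyle proxies to expected cost. A minimal sketch under invented data and features (the real study used roughly 1,500 data elements):

```python
# Hedged sketch: predicting annual health-care cost from lifestyle proxies.
# Every feature and number here is fabricated for illustration.
import numpy as np

# Feature rows: [owns_sports_car, has_investments, age]
X = np.array([
    [1, 0, 25],
    [0, 1, 60],
    [0, 0, 40],
    [1, 1, 50],
], dtype=float)
y = np.array([1200.0, 5200.0, 2500.0, 4100.0])  # annual cost in dollars

# Add an intercept column and fit ordinary least squares.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(features):
    # Estimated cost for a person with the given feature values.
    return float(np.dot(np.append(features, 1.0), coef))

print(predict([0, 1, 55]))
```

The privacy concern in the article is not the regression itself but the provenance of the inputs: features like car ownership and investments come from data brokers, not from the patient.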

The original article.

Summary of “On the 10th anniversary of the App Store, it’s time to delete most of your apps”

It’s hard to believe that just a decade ago, the App Store icon showed up on the iPhone.
Here on the 10th anniversary of the App Store, this is a perfect opportunity to head into your smartphone and clean out some of those ill-advised downloads from the past decade.
A report last year found that the average person launches roughly nine apps per day and interacts with roughly 30 apps over the course of a month.
Go to settings > General > iPhone Storage and it will give you an option called “Offload Unused Apps,” which automatically deletes apps you don’t use regularly, but saves the documents and data that go with them.
As a true app slob, I’d save more than 23 GB with this option. Both Apple and Android users will soon have a much better idea of which apps they actually use thanks to Apple and Google’s respective digital wellness initiatives.
If you don’t want to wait for those first-party solutions to roll out, you can try an app like Moment, which is meant to prevent you from spending too much time in apps but could provide similar insight about apps you’ve ghosted.
Photo, video, and audio editing apps are notorious for this kind of bloat because they store versions of the original media inside them.
Clicking into a specific app will tell you how much storage is dedicated to the app itself, and how much is dedicated to the documents and data it has accumulated.

The original article.

Summary of “‘Data is a fingerprint’: why you aren’t as anonymous as you think online”

In August 2016, the Australian government released an “anonymised” data set comprising the medical billing records, including every prescription and surgery, of 2.9 million people.
“It’s convenient to pretend it’s hard to re-identify people, but it’s easy. The kinds of things we did are the kinds of things that any first year data science student could do,” said Vanessa Teague, one of the University of Melbourne researchers to reveal the flaws in the open health data.
“The point is that data that may look anonymous is not necessarily anonymous,” she said in testimony to a Department of Homeland Security privacy committee.
More recently, Yves-Alexandre de Montjoye, a computational privacy researcher, showed how the vast majority of the population can be identified from the behavioural patterns revealed by location data from mobile phones.
“Location data is a fingerprint. It’s a piece of information that’s likely to exist across a broad range of data sets and could potentially be used as a global identifier,” de Montjoye said.
Even if location data doesn’t reveal an individual’s identity, it can still put groups of people at risk, he explained.
De Montjoye and others have shown time and again that it’s simply not possible to anonymise unit-record-level data (data relating to individuals), no matter how stripped down that data is.
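The mechanism behind de Montjoye's result is easy to demonstrate: even a handful of coarse spatio-temporal points makes most traces in a dataset unique. A toy sketch with fabricated traces (the real work analysed mobile-phone records of 1.5 million people):

```python
# Minimal sketch of location-trace uniqueness: how many users could be
# re-identified from just three (hour, cell-tower) observations?
# All users and traces below are invented for the example.
from collections import Counter

traces = {
    "user_a": (("08", "t1"), ("12", "t4"), ("19", "t2")),
    "user_b": (("08", "t1"), ("12", "t4"), ("19", "t7")),
    "user_c": (("09", "t3"), ("13", "t4"), ("19", "t2")),
    "user_d": (("08", "t1"), ("12", "t4"), ("19", "t2")),  # same as user_a
}

# A trace shared by only one user is a re-identifying fingerprint.
counts = Counter(traces.values())
unique = [user for user, trace in traces.items() if counts[trace] == 1]
print(f"{len(unique)}/{len(traces)} traces are unique re-identifiers")
```

In real mobility datasets the combinatorics work against anonymity: as the number of observed points grows, the share of unique traces approaches 100%.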
“There are firms that specialise in combining data about us from different sources to create virtual dossiers and applying data mining to influence us in various ways.”

The original article.

Summary of “The AI revolution has spawned a new chips arms race”

Google has a special AI chip for neural networks called the Tensor Processing Unit, or TPU, which is available for AI apps on the Google Cloud Platform.
IBM is developing a specific AI processor, and the company has also licensed NVLink from Nvidia for high-speed data throughput specific to AI and ML. Even non-traditional tech companies such as Tesla want in on this area: CEO Elon Musk acknowledged last year that former AMD and Apple chip engineer Jim Keller would be building hardware for the car company.
Why do we need more chips now, and so many different ones at that?
While x86 currently remains a dominant chip architecture for computing, it’s too general purpose for a highly specialized task like AI, says Addison Snell, CEO of Intersect360 Research, which covers HPC and AI issues.
The actual task of AI processing is very different from standard computing or GPU processing, hence the perceived need for specialized chips.
“Chips on the edge won’t compete with chips for the data center,” he says.
“Data center chips like Xeon have to have high performance capabilities for that kind of AI, which is different for AI in smartphones. There you have to get down below one watt. So the question is, ‘Where is not good enough so you need an accessory chip?'”.
A desire for more specialization and increased energy efficiency isn’t the whole reason these newer AI chips exist, of course.

The original article.

Summary of “Gmail app developers have been reading your emails”

Third-party app developers can read the emails of millions of Gmail users, a report from The Wall Street Journal highlighted today.
Gmail’s access settings allow data companies and app developers to see people’s emails and view private details, including recipient addresses, time stamps, and entire messages.
While those apps do need to receive user consent, the consent form isn’t exactly clear that it would allow humans – and not just computers – to read your emails.
Google employees may also read emails, but only in “very specific cases where you ask us to and give consent, or where we need to for security purposes, such as investigating a bug or abuse,” the company told the WSJ. Still, it’s clear that many apps have this access, from Salesforce and Microsoft Office to lesser-known email apps.
If you’ve ever seen a request like the one below when entering your Gmail account into an app, it’s possible you’ve given the app permission to read your emails.
As WSJ reports, other email services besides Gmail provide third-party apps similar access, so it isn’t just Google that may have these issues.
Both Return Path’s and Edison Software’s privacy policies mention that the companies will monitor emails.
While there’s no evidence that third-party Gmail add-on developers have misused data, just being able to view and read private emails seems like crossing a privacy boundary.

The original article.

Summary of “The Privilege of Knowledge”

Finance bloggers, myself included, fall victim to the privilege of knowledge as well.
People didn’t have the documented market history and technological capabilities we have today, so why should we have expected them to “buy and hold” back then? If anything, their history was riddled with banking panics and far more instability, so I can’t blame them.
From 1871 to 1940, the U.S. stock market grew at a rate of 6.8% a year after adjusting for dividends and inflation.
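It is worth compounding that rate out to see what it actually implies over the period:

```python
# Quick check of what 6.8% real annual growth compounds to over 1871-1940.
years = 1940 - 1871          # 69 years
growth = 1.068 ** years      # dividend- and inflation-adjusted multiple
print(f"$1 invested in 1871 grew to roughly ${growth:.0f} in real terms by 1940")
```

Even through panics and the Depression, the compounding multiple runs well into the dozens, which is the article's point about what long-run history looks like in hindsight.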
Remember, less than 4% of stocks have accounted for all of the wealth creation in the U.S. stock market since 1926.
Many of them remembered the market trauma of the 1930s as highlighted in the incredible book The Great Depression: A Diary by Benjamin Roth.
You might be thinking, why does any of this matter to me? Because though our investment knowledge is a privilege, it can also be a curse.
If the other participants have learned things from market history such as “the market goes up in the long run” or “buy the dip”, then your job as an investor hasn’t got any easier, even though you have more knowledge.
If everyone knows to buy the dip, then they may use that knowledge to stay invested longer, possibly irrationally propping up markets.

The original article.

Summary of “Why are all my weather apps different?”

One forecast was the rain and thunder predicted for Bournemouth by the BBC weather app for the Saturday spring bank holiday.
Is our ability to predict temperature, precipitation and wind speed improving? If so, how come forecasts can vary so widely depending on which smartphone apps we use? How long have human meteorologists got before supercomputers and artificial intelligence make them redundant? And when can we expect 100% accurate forecasts?
The foundation of modern weather forecasting involves gathering huge amounts of data on the state of the atmosphere and Earth’s surface, such as temperature, humidity and wind conditions.
The extra number-crunching firepower also enables “ensemble forecasting”, whereby forecast models are run multiple times using slightly different starting data to explore the probabilities of various outcomes.
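The ensemble idea can be sketched in a few lines: run the same model many times from slightly perturbed starting conditions and read a probability off the spread of outcomes. Everything below (the "model", the threshold, the perturbation sizes) is invented for illustration; real ensembles perturb full atmospheric states on supercomputers.

```python
# Toy sketch of ensemble forecasting with a trivial stand-in model.
import random

random.seed(42)

def toy_model(initial_humidity):
    # Stand-in "physics": humidity drifts randomly over 24 one-hour steps;
    # we call it rain if it ends above a threshold.
    state = initial_humidity
    for _ in range(24):
        state += random.uniform(-0.02, 0.03)
    return state > 0.75  # True = rain

# Perturb the measured starting condition slightly for each ensemble member.
observed_humidity = 0.60
runs = [toy_model(observed_humidity + random.gauss(0, 0.01)) for _ in range(100)]
print(f"Chance of rain: {sum(runs) / len(runs):.0%}")
```

The fraction of members that produce rain is the probability the app shows you, which is one reason different apps, using different ensembles, disagree.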
What is most important – temperature, rain or wind conditions? Is average overall error most useful, or how often a prediction meets reality? “There are many, many ways to measure forecasting accuracy,” says Eric Floehr, founder of ForecastWatch, a US company that analyses the performance of weather providers.
A ForecastWatch report published last year compared the accuracy of six leading global forecast providers – AccuWeather, the Weather Channel, Weather Underground, Foreca, Intellicast and Dark Sky.
They use different algorithms based on different forecast models with different levels of detail.
In his 2012 book The Signal and the Noise, US statistician Nate Silver highlighted how plotting forecasters’ rain predictions against actual weather showed some consistently erred on the pessimistic side, especially at lower and higher probabilities of rain.
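Silver's check is a calibration (reliability) plot: bucket forecasts by their stated probability and compare with how often rain actually occurred. A sketch with invented forecast/outcome pairs:

```python
# Sketch of a forecast-calibration check. The records are fabricated;
# a well-calibrated forecaster's observed frequencies match stated ones.
from collections import defaultdict

# (stated rain probability, whether it actually rained)
records = [
    (0.1, False), (0.1, False), (0.1, True),  (0.1, False),
    (0.5, True),  (0.5, False), (0.5, True),  (0.5, False),
    (0.9, True),  (0.9, True),  (0.9, False), (0.9, False),
]

buckets = defaultdict(list)
for prob, rained in records:
    buckets[prob].append(rained)

for prob in sorted(buckets):
    observed = sum(buckets[prob]) / len(buckets[prob])
    print(f"forecast {prob:.0%} -> observed {observed:.0%}")
```

In this toy data the 90% bucket sees rain only half the time: the systematic "wet bias" Silver found, where forecasters overstate the chance of rain at the extremes.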

The original article.

Summary of “Google Is Building a City of the Future in Toronto. Would Anyone Want to Live There?”

Cut off from gleaming downtown Toronto by the Gardiner Expressway, the city trails off into a dusty landscape of rock-strewn parking lots and heaps of construction materials.
Google is not the first company to try reimagining a city.
Cities themselves have more money and energy than ever; rather than building from scratch, like Disney did, modern smart-city builders want to harness the energy and dynamism of existing cities.
The deal hasn’t exactly been a victory for transparency; Waterfront Toronto has declined to make the exact terms of its deal with Sidewalk public, so no one on the outside knows exactly what the city has promised Google, or vice versa.
“It’s horrible: the antithesis of privacy. They use sensors to identify everybody and track their movements.” That city in the United Arab Emirates set out in 2014 to become what it called the world’s smartest city.
Says Cavoukian, “It’s not going to be a smart city of surveillance. It’s going to be a smart city of privacy, and that will be a first.”
“Toronto got excited, people in the city are demanding this new city, and the city has, a little bit, lost control of the conversation,” says Simone Brody, executive director of Bloomberg Philanthropies’ What Works Cities.
“This is really a grand experiment, in many respects, that is going to teach not just Toronto but really cities all across the world what is the future city going to look like,” says Bruce Katz, the author and former Brookings Institution official.

The original article.

Summary of “Let’s make private data into a public good”

All are designed to maximize the advantages of sticking with Google: if you don’t have a Gmail address, you can’t use Google Hangouts.
The bulk of Google’s profits come from selling advertising space and users’ data to firms.
Let’s not forget that a large part of the technology and necessary data was created by all of us.
The low tax rates that technology companies are typically paying on these large rewards are also perverse, given that their success was built on technologies funded and developed by high-risk public investments: if anything, companies that owe their fortunes to taxpayer-funded investment should be repaying the taxpayer, not seeking tax breaks.
Measuring the value of a company like Google or Facebook by the number of ads it sells is consistent with standard neoclassical economics, which interprets any market-based transaction as signaling the production of some kind of output; in other words, no matter what the thing is, as long as a price is received, it must be valuable.
There is indeed no reason why the public’s data should not be owned by a public repository that sells the data to the tech giants, rather than vice versa.
The key issue here is not just sending a portion of the profits from data back to citizens but also allowing them to shape the digital economy in a way that satisfies public needs.
Mariana Mazzucato is a professor in the economics of innovation and public value at University College London, where she directs the Institute for Innovation and Public Purpose.

The original article.