New Cornell certification helps create an ethical data science workplace of the future

We increasingly trust algorithms, whether we are applying for a mortgage or a new job, or making personal health decisions. But what about a security system that uses face recognition and keeps a 55-year-old office caregiver from her night shift? Or groups of people who are automatically cropped out of photos on social media? These are the unintended and often unjust consequences of data science tools, amplified across millions of users. They are also highly preventable.

This is a lesson that lawyer and epidemiologist M. Elizabeth Karns incorporates into every data science and statistics course she teaches in the Department of Statistics and Data Science. Her students will decide how data is used in the future, and while poor business decision-making is nothing new, Karns says the accelerated and aggregated impact of today’s data science applications is what makes it so dangerous: a decision by a single individual, team, or company can affect the lives of millions of people. Moreover, the torrent of new technologies is moving faster than our regulatory systems, leaving a gap in responsibility. Even scientists themselves often do not know exactly what is going on inside their algorithms.

“This little magic box [the algorithm] determines our life choices, often without any transparency, proper procedure or appeal,” says Karns. “That is why ethics is so important. We do not have to further marginalize certain groups, and individuals should not have to worry about their security because of poorly and unethically designed data applications.”

To combat the current “wait and see” approach to algorithms, Karns has partnered with eCornell to launch a new online certification program in data ethics. The program gives practitioners tools to embed ethics at every stage of a project and throughout the data science workplace. It consists of four two-week courses that offer data scientists, or anyone managing data projects, a structured “break” to consider the ethical implications of their work. Karns begins by reviewing data science issues at the macro level of equity, justice, security, and privacy, then shifts the focus to individual choices.

These choices, Karns says, are rooted in virtue ethics: the personal values or virtues that drive our behavior. The certification program guides participants to identify and clarify their own virtues, and then offers low-stakes mechanisms for addressing ethical issues when those virtues conflict with those of the project, team, or data science organization.

“With virtue ethics, we can identify what’s going on that makes us uncomfortable,” Karns says. “Is it the process? The personalities? How the data is collected? Then we use our moral imagination to play out future ethical considerations and consider alternatives.”

These alternatives do not necessarily require compromises, because ethical best practices, which focus on harm reduction, are also excellent business practices. Documentation is a valuable tool, as is reference to corporate values, risk management and reputation. That’s why the new Data Ethics certification program isn’t just for practitioners; Karns says data scientists’ managers often don’t understand the kinds of demands they are making, or the choices and risks those demands carry. She hopes that making data ethics education more widely available through online courses will be a significant step toward shaping the future of data ethics, one in which managers expect discussions of potential ethical issues and incorporate ethical reflection at every step of the development process.

“The ability to recognize and mitigate the damage that results from our actions is key to building applications that are fair, equitable and secure,” says Karns. “Ethical thinking is key training, and we are currently at a particularly opportune time to provide practitioners with the language and tools they need to improve the performance of data science and our world.”

Sarah Thompson is a writer for eCornell.
