Notes On "Global Guidelines: Ethics in Learning Analytics"

March 13, 2019

In my post referencing the ICDE's Global Guidelines: Ethics in Learning Analytics yesterday I said:

This document summarizes the considerations of an ICDE working group on learning analytics. For the most part it is a clearly-written and even-handed treatment of a number of difficult subjects, though it would appear to be relentlessly focused on the institutional perspective (not surprising given its intended audience of educational policy makers and regulators as well as middle and senior management in higher education institutions). The presumption throughout is that a global ethic in learning analytics is needed, can be known, and comprehends the issues in this document. I'm not so sure, but that's a discussion that cannot be attempted in this short post.

In this article I expand on and substantiate the comments made in my short post.

A note on quotation: these are notes. I draw liberally from the text of the ICDE document. Only the content in the sections titled 'Comment' is my own. All other content in this article is either directly quoted from, or paraphrased from, the original, and only reorganized for clarity and exposition.

Definition and purpose of the use of Learning Analytics

Where is data drawn from?

All from (Sharples et al., 2014):

behavioural data from online learning systems (discussion forums, activity completion, assessments)

functional data extracted from registration systems and progress reports

data shared by students as part of their daily social and study lives

How is Data Analyzed?

"predictive models (which suggest potential completion rates, for example)

social network analyses (which examine possible relationships between networks of individuals and groups)

relationship mining (which analyses links between sets of data patterns such as student success rates),

dashboards (data visualisation which provides a means of delivering feedback to educators and learners).

The range of analytic practices along the axes of value and difficulty has been set out helpfully in graphic form (Davis 2013)."

What is it Used For?

to improve the chances of student success (Gasevic, Dawson & George Siemens, 2015)

to build better pedagogies, empower students to take an active part in their learning, target at-risk student populations, and assess factors affecting completion and student success (NMC 2011)

to inform pedagogy, allocate resources and inform institutional strategy (Rienties, Boroowa, Cross, Kubiak, Mayles, & Murphy, 2016)

Comment

Some key questions are elided in this initial summary:

What counts as an acceptable source of data? What data must be included in learning analytics?

What data must be excluded? This is mentioned briefly in section 6.1 (local data protection legislation, plus "the institution may also want to give consideration to whether any data categories are irrelevant, or sufficiently sensitive to warrant exclusion") but not discussed from an ethical perspective.

As an example of the latter question, we might well question whether it is ethical to question students about their race, require genetic samples, collect rumour and innuendo, collect data obtained by force or by torture, etc.

What counts as an acceptable method of analyzing the data? Are some methods of analysis (regression toward the mean, for example) unethical? Or is the method used to analyze the data beyond the scope of ethics in learning analytics?

Are some uses to which the data are put unethical? For example, is it unethical to use learning analytics data to support differential pricing (say, charging students who are more likely to fail more money for supplemental tutoring)?

Data ownership and control

Is Data Property?

"The emphasis on ‘data as property’ overestimates individuals as autonomous and rational agents (e.g., Lazaro & Le Métayer 2015)."

“student data is not something separate from students’ identities, their histories, their beings. ... data is an integral, albeit informational part of students’ being. Data is therefore not something a student owns but rather is. Students do not own their data but are constituted by their data” (Prinsloo 2017).

"GDPR sets out principles establishing ways in which data may be accessed, stored and used, and defines as special categories personal and sensitive data."

Ownership of Data

Institutional Ownership Model

In a learning analytics context, the presumption is often that data collected is owned by the institution

students themselves might be expected to take another view.

Institutional Stewardship Model (and Exploitation)

the institution does not own the student data that it holds, but has temporary stewardship.

the institution may store datasets under certain conditions and for specified periods, but within a higher education context the issue of ownership could remain open.

In this situation, the institution would be able to collect, analyse and apply the data, but neither the data nor the learning analytics outputs are theirs to exploit for unrelated gain.

Student Rights that "Might be Argued"

institutions should grant students some input to determine which data can be collected, how that data can be used, who is able to access it, and for what purposes (Prinsloo, 2017)

institutions should grant students the ability to correct and/or add context to their raw data,

institutions should grant students the ability to review and make a case for choices which appear to be limited as a result of a learning analytics application

Issues Around Third Party Sharing

"might typically include sharing student data with service providers for marketing purposes, for example"

"the third party would typically be bound by the data protection rules which apply to the institution" (no reference provided)

Comment

Some key observations here:

At what point does data come into existence, and at what point does it become property? For example, a student clicks on a link, then this activity is logged, and then the activity record is stored in an LRS (learning record store). What is the 'property' here: the action, the log, or the activity record? This question is not examined at all.

This section at no point takes seriously the idea that the student might be the owner of the data. At best, students are offered limited rights, and the question of ownership might "remain open". This is especially astonishing given some of the data being considered (e.g., "data shared by students as part of their daily social and study lives").

The case for the student is made in a similarly weak fashion with respect to what are arguably fundamental rights: to correct data, to limit the impact of incorrectly analyzed data, etc.

Third party sharing (or, more accurately, sales of data to third parties) is assumed as acceptable by default, with discussion only concerning limitations on the rights of third parties.

This is a significantly flawed section that basically assumes, and then rationalizes, a maximalist set of rights for the institution, with few and limited rights for all other parties.

Transparency

What is Transparency?

making clear to students and to other stakeholders the purpose of learning analytics

relates primarily to how student data is collected, analysed and used to shape students’ potential learning journeys

also includes making clear what data is collected (and what is not) and any assumptions made about that data

Arguments for Transparency

brings with it an opportunity to engage with stakeholders to gain greater insight and involvement

facilitates both an opportunity to amend or correct the dataset (or the interpretations gained from it) and to add to institutional understanding of relevant factors which impact on student success

moral duty of the institution to act in a student’s best interests and perhaps advising alternative study paths

Arguments Against

Making students and stakeholders more aware of the uses of data may pose additional challenges

it is not always possible to be completely transparent

models built around regression approaches for example can be difficult to understand and interrogate

not always clear why one student may be identified as being potentially more vulnerable than another

not always in the best interests of students to communicate a predicted poor outcome

Comment

The term 'transparency' is very narrowly defined here and discussed only with respect to how data is used "to shape students’ potential learning journeys", and not with respect to the many other potential uses of student data, including some discussed above.

There is no discussion of whether an institution has an ethical obligation to transparency generally, and in particular, whether an institution has an obligation to reveal its collection methods, analytical processes and tools, models, and interpretation.

Arguments in favour of transparency generally centre only around the idea that it might enable students to contribute more data.

Arguments against transparency are all expressed as modalities, i.e., that there is some sort of risk, without considering the extent or impact of the modality.

In general, this section operates from the presumption that non-transparency is the ethical choice, and that openness needs to be specially justified in all cases. I would contend that the opposite is true: the institution has an ethical obligation at the outset to be transparent, and non-transparency would be justified only in certain circumstances.

Accessibility of data

Overview

the determination of who has access to raw and analysed data

the ability of students to access and correct their own data

making clear which data might typically be included within a learning analytics application, and which might always be assumed to be out of scope

various categories of staff will have sight of some categories of raw student data as a normal aspect of the staff role, depending on their permissions. Where data categories are not required as part of that role, however, data would typically not be made available
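The role-based access described in the last point can be pictured with a minimal sketch. The role names, data categories and the `visible_data` helper are all hypothetical, chosen only for illustration; nothing here is taken from the ICDE document.

```python
# Hypothetical mapping from staff role to the data categories that role needs.
ROLE_PERMISSIONS = {
    "tutor": {"assessment_results", "forum_activity"},
    "registry": {"registration_status", "progress_report"},
}


def visible_data(role, student_record):
    """Return only the categories the given role is permitted to see."""
    allowed = ROLE_PERMISSIONS.get(role, set())  # unknown roles see nothing
    return {k: v for k, v in student_record.items() if k in allowed}


# Illustrative record; the values are made up.
student_record = {
    "assessment_results": [72, 65],
    "forum_activity": 14,
    "registration_status": "enrolled",
    "disability_flag": True,  # not in any role's set above, so never shown
}

print(visible_data("tutor", student_record))
```

Note that in a scheme like this the institution alone decides what goes into the permissions table; the sketch says nothing about how sensitive a category is or who else might have a claim to access.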

Comment

The passive phrasing of this section ('typically' this, 'typically' that) is problematic and reads like a rationalization of existing practice.

No consideration is given to whether access may be granted or withheld based on the type of data, even though some data may be much more sensitive than other data

There is no consideration of whether access should be granted or restricted based on the possibility of harm being caused or prevented

Additionally, there is no discussion in this section of the responsibilities created by access to data, for example, the responsibility to access data securely

The section reads as though it is the right and purview of the institution to grant access as it wishes, with no real constraints on this right, and no instance in which anyone else (such as students) may have equally weighted rights to access to the data.

Validity and reliability of data

Datasets

ensure that data collected and analysed is both accurate and representative of the issue being measured

Datasets should be kept current as far as is possible, with opportunities for students and other stakeholders to refresh and replace existing data

Proxy Measures

Proxy measures should be used with caution

consider what the institution is trying to measure and to investigate how it might best be represented (rather than looking at available data first and figuring how it might be applied).

Statistical Analysis

ensure that data sets are complete and sufficient to enable robust calculations to be made

models used to analyse, interpret and communicate learning analytics to stakeholders (support staff, advisers, faculties, students) should be sound, free from algorithmic bias, transparent where possible and clearly understood by the end users

Comment

This section could be significantly deeper.

There are existing standards to ensure databases are reliable, for example, "single source of truth", and "data immutability". These should be referenced in any discussion of learning analytics.

There are no 'proxy measures'. There is data plus a model. If it is not what was measured, it is not data, it is an analysis of the data.

A broader and more comprehensive approach to ethics in statistical analysis should be considered, for example as described here.

Given the broad-based concerns about the use of poorly-sourced data in an unclear or even unethical fashion, especially given the slim discussion of transparency above, this section poses significant cause for concern.

Ultimately, the question has to be asked, is there an ethical obligation to perform learning analytics in a rigorous and scientifically valid manner, or can institutions and their contractors use whatever mechanism they feel is appropriate?

This question is of additional importance in the domain of education, training and development because these are not optional or recreational uses of data. It's not Netflix nor even the Facebook algorithm. Getting analytics right or wrong will impact on a person's future. Hence, professional practice ought to be at the forefront of ethics in this discipline.

Institutional responsibility and obligation to act

Arguments for an obligation to act

Example: having observed students not submitting summative assignments or having calculated the probabilities of module completion, is the institution obliged to act on what it has identified?

Arguments against an obligation

Often resources are constrained, and in distance learning institutions in particular where one-to-one conversations are less easy, it is not easy to reach all students who may have been identified as likely to benefit from a support intervention of some type

Institutional Policies

policy for identifying where support resource is focused, for example:

on the group identified as most potentially vulnerable;
toward students on high population core modules;
at students with particular characteristics (for example, those with known disabilities, etc.).

decision-making process should be transparent and clearly understood by all stakeholders

Comment

There is a longstanding principle in ethics of 'ought implies can' which forms the basis of the commentary in this section

the authors are clearly making the point that institutional obligation only exists insofar as it is within the institution's means

The question remains, however, as to whether the institution should offer services where it would, as a matter of (say) 'duty of care', incur obligations it knows it cannot satisfy

for example, would a doctor start an operation without knowing it could be completed safely?
should a university offer a course knowing that it cannot sufficiently support all the students in the course?

There is moreover the question of whether institutions have certain obligations only toward some of their students (those identified in the discussion of policies) or whether these extend to all, and if not to all, then on what basis these distinctions are made

Although not stated here, presumably the recommendation here is that transparency is subject to the same caveats as those cases in the 'Transparency' section above.

Overall, the discussion of institutional obligations as a matter only of institutional capacity is demonstrably insufficient. The offering of certain services creates obligations such that, if the obligations cannot be met, the service should not be offered. This section should consider such cases.

Communications

Communications to Students

Contact based on basic tracking of students is potentially less contentious

that triggered by predictive analytics needs a great deal more consideration (because) predictions are only a probability generated by a computer

communications with students are perhaps most effective if couched in general support terms rather than in probabilistic terms

Communications to Staff

whilst predictive analytics are useful in proactively alerting tutors or support staff to issues before they may arise, it is also important to seek additional context.

Regular communications to staff should help ensure that they understand the approach:

the underlying values linked to the institution’s mission and strategy;
the anticipated benefits for students;
the limitations of data and its interpretation;
guidelines for ethical practice.

Comment

The discussion in this section appears to be focused on rationalizing why communications might be abridged, or need explaining.

Unfortunately some significant issues with respect to the ethics of communication related to learning analytics are missing from this discussion:

Is there a general obligation to communicate the results of learning analytics to students and staff?

Should communications be complete and truthful?

There seems to be a suggestion in the discussion that untruthful, or less than fully truthful, communications should be the norm (presumably because the feeling is that students would not understand truthful communications).

Should communications of results include interpretations of those results, and on what basis?

Should the need for, and the nature of, communications be informed by external factors (such as the institution's interests, its mission and goals, or its legal liability)?

Cultural values

Multicultural contexts

understanding and interpreting data are necessarily more complex

(For example) A measure of participation or engagement may differ in different contexts
(For example) measures established as being correlated with successful or unsuccessful outcomes are likely to differ in different geographies and cultures.

Considerations Required

not all institutions will have the capacity or resource to develop in-house analytics tools developed on knowledge of its own students,

care should be taken if purchasing analytics packages from developers to ensure that the approach is:

fit for purpose
can be adapted if appropriate with local data and with local constraints in mind.

Comment

Is there a default culture from which accommodations for others are made in the form of 'adapting' analytics? Or should ethics begin with the presumption that there is no default culture?

There is a distinction to be drawn between 'data' (which would presumably be the same across cultures) and 'model' (i.e., interpretations of data from particular cultural perspectives)

Therefore: what culturally-sensitive elements is it ethical to include in a model, and what culturally-sensitive elements (if any) are unethical for inclusion?

for example, elements that reference language spoken may be ethical
for example, elements that reference skin colour may be unethical

Cultural and local considerations extend beyond understanding and interpreting data

for example, in some cultures, collecting certain data may be unethical
for example, in some cultures, allowing some people to see certain data may be unethical

Ultimately, for each of the sections discussed in this report, cultural considerations may come into play. It seems wrong, therefore, to place 'culture' in a small and separate section

Consent

Presumption of the Requirement for Consent

this principle should be built around a minimum of informed consent (that is, transparency before registration).

(From above) the lack of clarity around who owns the data muddies principles of meaningful consent

Factors impacting Consent

Ideally, consent should not be considered in simple binary terms, but presented to students as a menu of options which depend on:

the purpose of the collection, analysis and use of their data,
the disciplinary module or context,
the variety of possible data that can be collected, analysed and used,
and an understanding of the risks of opting in/out
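To make the contrast with binary consent concrete, here is a minimal sketch of what a 'menu of options' might look like as a data structure. The `Purpose` values, the `ConsentRecord` class and the module and category names are all hypothetical; this is one possible representation under those assumptions, not something proposed in the ICDE document.

```python
from dataclasses import dataclass, field
from enum import Enum


class Purpose(Enum):
    # Hypothetical purposes; a real list would come from institutional policy.
    INSTITUTIONAL_REPORTING = "institutional_reporting"
    BASIC_SUPPORT = "basic_support"
    PREDICTIVE_INTERVENTION = "predictive_intervention"


@dataclass
class ConsentRecord:
    student_id: str
    # Each grant is a (purpose, module, data category) triple,
    # so consent is scoped rather than a single yes/no flag.
    grants: set = field(default_factory=set)

    def grant(self, purpose, module, category):
        self.grants.add((purpose, module, category))

    def withdraw(self, purpose, module, category):
        # discard() is a no-op if the grant was never given.
        self.grants.discard((purpose, module, category))

    def permits(self, purpose, module, category):
        return (purpose, module, category) in self.grants


record = ConsentRecord("s-001")
record.grant(Purpose.BASIC_SUPPORT, "MATH101", "forum_activity")

# Consent for one purpose does not carry over to another.
allowed_support = record.permits(Purpose.BASIC_SUPPORT, "MATH101", "forum_activity")
allowed_prediction = record.permits(Purpose.PREDICTIVE_INTERVENTION, "MATH101", "forum_activity")
print(allowed_support, allowed_prediction)
```

The sketch covers only the first three factors in the list (purpose, module, data category). The fourth, an understanding of the risks of opting in or out, is a matter of communication and cannot be captured in a record like this.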

Timing of Consent

Consent sought at the point of registration

for uses beyond those required for institutional reporting and basic student support
students will be largely unaware of learning analytics and how it may be used
most convenient for the institution, but arguably less meaningful for the student.
If consent is to be sought at this stage, it should be coupled with transparency (of purpose, of data collected, etc.) and potentially with a later option to withdraw consent
an expectation that users should consent to uses of personal data unknown at the point of registration seems to be an unreasonable and unethical one

Consent sought later

it is preferable to differentiate between initial consent for the collection of data and specific consent when data are used to intervene in the choices students have and/or to adapt their learning experience or access to resources.
opportunities to provide or withdraw consent where an intervention might significantly alter their experience.

Nature of the Data

Non-Sensitive Data

Sclater (2017): consent is not required (since this may be considered as of legitimate interest)
Prinsloo and Slade (2018) question the claim that consent is not required

data is not a neutral construct, but is shaped by the ideas and contexts used to generate it
what may constitute non-sensitive data in one context may be considered sensitive in another

Sensitive Data (under the GDPR, will be labelled ‘special category data’)

Sclater (2017): consent is required for use of sensitive data
consent required for interventions directly with students on the basis of the analytics.

Comment

Though the discussion on consent covers a lot of ground, it lacks a basis in ethics, and should reference wider literature.

The requirement for consent may be based (for example) on a principle of 'Respect for Persons', as expressed by the Government of Canada panel on research ethics. This principle implies:

individuals who participate in research should do so voluntarily,
understanding the purpose of the research,
and its risks and potential benefits, as fully as reasonably possible.

The discussion should consider the information generally required for informed consent. This is described in detail in the panel on research ethics document.

Under no circumstances may researchers proceed to conduct research with anyone who has refused to participate.

Consent shall be given voluntarily.
Consent can be withdrawn at any time.
If a participant withdraws consent, the participant can also request the withdrawal of their data or human biological materials.

These considerations would make moot distinctions on the basis of the type of data, the timing of consent, or the ownership of the data (these are discussions of mechanisms rather than of ethics).

Coercion and incentives are significant factors in education, especially where a decision not to participate may harm a student's educational outcome, either by refusal of admission, or dependence of the support on research data, or influence on grades or outcomes.

It should be emphasized at this juncture that (a) ethics is wider than law - what may be legal may also be unethical, and (b) any use of data beyond direct provision of benefit to the student is a form of research, and ought properly be considered under the provisions of research ethics.

Student agency and responsibility

Power

there is an asymmetrical power-relationship between institutions and students

proactive engagement at least seeks to treat students as equal participants in the uses of their data.

Why engage students?

ensure that students understand their responsibility for keeping personal information up to date (and can give informed/meaningful consent)

achieve a more accurate interpretation of data relating to student behaviours

improve understanding of what forms of intervention and support are most appropriate

understand how to tailor a student’s learning journey to meet their needs, potentially as a personalised learning path

produce outcomes that students will find useful and be able to respond positively to, which might include a decision to continue or discontinue with their studies

Comment

Legally and ethically, with respect to learning analytics, it should be understood that students are equal participants with institutions, and not merely 'treated' as such

Moreover, while the decision to participate in analytics may create obligations on the part of the students, no such obligations exist a priori, and would need to be set out as part of informed consent

Similarly, knowledge of the risks and benefits of participation are part of the process of consent, not an optional and sometimes ad hoc engagement process.

From Alberta: The purpose of this Act is to govern the collection, use and disclosure of personal information by organizations in a manner that recognizes both the right of an individual to have his or her personal information protected and the need of organizations to collect, use or disclose personal information for purposes that are reasonable.


Half an Hour