Columbia University Data Breach Exposes 460GB of Sensitive Records in Targeted Hack

Casmer Labs, Cloud Storage Security’s (CSS) internal threat research laboratory, closely monitors breaches and threats impacting cloud environments, particularly the data they contain. In this report, we analyze the June 2025 cyberattack on Columbia University, a significant breach of a higher education institution in terms of scope, impact, and complexity.

 

What Happened

On June 24, 2025, Columbia University experienced an outage across its Morningside campus. Core services including email, student information systems, digital signage, authentication infrastructure, and internal web platforms faltered due to the incident. While initially described as a technical failure, the university confirmed on July 1 that the event was a targeted cyberattack carried out by an external actor who had been active within Columbia’s systems for nearly two months prior to discovery.

The attacker, who has since self-identified as a political hacktivist, claimed to have breached Columbia’s systems to expose post-affirmative action admissions practices. According to their own disclosures and third-party validation, the attacker exfiltrated 460 GB of sensitive data.

 

What Was Stolen

The breach exposed significant volumes of personally identifiable information (PII) and institutional data. The stolen files reportedly included admissions records for over 2.5 million applicants dating back decades, UNI credentials for more than 350,000 students and staff, and a trove of highly sensitive PII, including Social Security Numbers, passport scans, citizenship status, disciplinary records, financial aid data, and university payroll files.


A sample dataset of 1.6 GB, shared with Bloomberg and analyzed by cybersecurity researchers, confirmed the authenticity of the breach. In addition, the attacker published a queryable online registry of stolen UNI IDs, allowing users to check whether their credentials were compromised.

 

Technical Breakdown of the Attack

Columbia University has not released specific details about how the attacker initially gained access, what tools or vulnerabilities were used, or how access was maintained over time. As of now, no confirmed indicators of compromise (IOCs), malware strains, or exploited vulnerabilities have been disclosed.

What is confirmed is that the attacker remained active within Columbia’s environment for an extended period and had access to internal systems. The scope of that access was broad enough to impact core services and reach peripheral systems, including campus digital signage and dormitory displays, which were temporarily altered to show politically themed messages and imagery.

A forensic investigation is currently underway, led by a third-party cybersecurity firm in coordination with law enforcement. Columbia has stated that there is no ongoing threat activity within its network.

 

Broader Implications

This was not a ransomware attack, as no demands were made. Instead, it fits a growing pattern of data-centric, ideologically motivated breaches where the attacker’s goal is reputational damage, policy exposure, or political influence. Although Columbia has not confirmed the methods used, the incident suggests a high level of technical competence: compromising hypervisors, bypassing multi-domain AD controls, and coordinating multi-gigabyte exfiltration without detection.

The scale and depth of this breach also suggest systemic weaknesses common in higher education: broad attack surfaces, under-segmented hybrid infrastructure, and legacy systems lacking close monitoring and protection of sensitive data at the storage layer.

 

Risk to Stakeholders

For students, alumni, and faculty, the most immediate threat is identity theft: Social Security Numbers, passport data, and financial records were compromised in this breach. For Columbia’s administration, this incident represents a major institutional crisis that could trigger investigations under the federal FERPA statute and New York’s SHIELD Act, in addition to reputational damage.

 

Defensive Recommendations from CSS

In response to this event, Casmer Labs recommends that all higher education institutions, as well as other data-centric institutions, evaluate their security posture, especially those running hybrid environments that mix traditional hypervisors, cloud storage, and legacy access systems.

Casmer Labs strongly recommends implementing in-tenant file-level and object-level scanning to detect malware and sensitive data exposure in real time. This includes scanning data at the point of ingestion, monitoring for anomalous access patterns, and implementing alerting workflows that can revoke permissions automatically when threats are detected.
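As a rough illustration of scanning at the point of ingestion, the sketch below checks incoming text for strings shaped like US Social Security Numbers before the content is stored. It is a minimal example under stated assumptions, not a description of any CSS product: the pattern, the `find_ssn_candidates` and `should_quarantine` names, and the threshold are all hypothetical, and real classifiers use far more context than a single regular expression.

```python
import re

# Hypothetical pattern: SSN-shaped strings written as AAA-GG-SSSS.
# The lookaheads skip area numbers 000, 666, and 900-999, group 00,
# and serial 0000, which are not assigned to real SSNs.
SSN_PATTERN = re.compile(
    r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b"
)


def find_ssn_candidates(text: str) -> list[str]:
    """Return every SSN-shaped substring found in the text."""
    return SSN_PATTERN.findall(text)


def should_quarantine(text: str, threshold: int = 1) -> bool:
    """Flag content for review when it holds at least `threshold` candidates.

    The threshold is an illustrative knob, not a recommended value.
    """
    return len(find_ssn_candidates(text)) >= threshold


if __name__ == "__main__":
    upload = "Applicant record: Jane Doe, SSN 123-45-6789, phone 555-0100"
    print(find_ssn_candidates(upload))
    print(should_quarantine(upload))
```

A check like this would typically sit in the upload path, so that flagged objects can be tagged or held for review before they become broadly readable.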

Organizations must also ensure ESXi hosts are fully patched, particularly for CVE-2024-37085, and remove any Active Directory integration from hypervisor environments unless strictly necessary. MFA should be enforced at the hypervisor and domain levels, and all admin group modifications should be logged and reviewed. Outbound traffic to unfamiliar or foreign IP ranges should be monitored and rate-limited, especially for large or sporadic file transfers.
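The outbound-traffic recommendation above can be sketched as a simple volume check over flow records. This is a hedged example only: the `(source_host, dest_ip, bytes_out)` record shape, the `flag_large_egress` name, and both thresholds are assumptions made for illustration, and a production system would baseline each host against its own history rather than against its peers.

```python
from collections import defaultdict
from statistics import median


def flag_large_egress(flows, multiplier=10.0, min_bytes=1_000_000_000):
    """Return hosts whose total outbound volume stands out from their peers.

    flows: iterable of (source_host, dest_ip, bytes_out) tuples, a
    hypothetical shape standing in for real flow or proxy logs.
    A host is flagged when its total is at least `min_bytes` and more
    than `multiplier` times the median total across all hosts.
    """
    totals = defaultdict(int)
    for source_host, _dest_ip, bytes_out in flows:
        totals[source_host] += bytes_out

    if not totals:
        return []

    baseline = median(totals.values())
    return sorted(
        host
        for host, total in totals.items()
        if total >= min_bytes and total > multiplier * baseline
    )


if __name__ == "__main__":
    sample = [
        ("web-01", "203.0.113.10", 10_000_000),
        ("web-02", "203.0.113.11", 20_000_000),
        ("app-01", "203.0.113.12", 15_000_000),
        ("db-01", "198.51.100.7", 5_000_000_000),
    ]
    print(flag_large_egress(sample))
```

Comparing against the median rather than the mean keeps one very large transfer from inflating the baseline it is being measured against.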

For connected IoT systems like signage displays, we recommend complete network segmentation, forced credential rotation, and the disabling of unnecessary remote access protocols such as SSH.

 

How CSS Helps

Cloud Storage Security offers two products designed to mitigate storage-layer threats. Our AV solution provides in-tenant scanning that identifies, tags, and automatically remediates malware and ransomware entering your AWS, Azure, and GCP storage resources. DataDefender expands on our award-winning malware protection solution, with the goal of helping organizations prevent the type of breach Columbia experienced. It offers real-time data discovery and inventory, security checks on 90+ configuration options, and monitoring of all cloud storage resources, including Amazon S3, EBS, EFS, FSx, Azure Blob, and Google Cloud Storage buckets. Together, AV and DataDefender provide customers with comprehensive protection from storage-layer threats, whether malware or ransomware infiltration, mishandling of sensitive data, misconfigurations such as publicly accessible S3 buckets, or suspicious activity such as new rules being added to storage resources and unauthorized data downloads by users.

DataDefender uses behavioral analytics, anomaly detection, and machine learning to detect and stop internal and external threats. It can be configured to automatically remove access permissions when an exfiltration attempt is detected, preventing sensitive data from leaving your environment in the first place. It also supports classification and redaction of high-risk data, aiding in both prevention and compliance.

 

Final Thoughts

The Columbia University breach exemplifies the risk of deep attacks on hybrid infrastructure when visibility is limited, monitoring is minimal, and storage resources are unscanned and unprotected. While this attacker’s motive may have been political, the techniques and access paths used are no different from those favored by financially motivated actors. Every institution with large-scale data repositories should assume that these same weaknesses could be exploited against it next.


To speak with a security expert or request a private demo of DataDefender, visit cloudstoragesecurity.com/contact

Casmer Labs will continue to monitor this incident and provide updates as Columbia’s investigation and disclosures unfold.
