No File Left Behind

Does your organization move extra-large files through Amazon S3 workflows, files that should be scanned for viruses or malware to meet compliance requirements and eliminate downstream risk, but that can't be scanned?

Perhaps, as a media company, you want to ensure that any video files received from third parties are infection free before storing or working with them, but you haven't had a way to scan them because of their size. Or, as a life sciences organization, you want to ensure that all data sets you ingest and make available for use are free of ransomware, but you couldn't because of the volume of data contained in a single file.

No more! To meet customer demand, Antivirus for Amazon S3 now supports scanning files up to 5 TB in size. This is perfect for health care organizations, defense departments, and organizations in any other industry that works with really BIG files. With our Extra Large File Scanning functionality, you never have to skip scanning a file due to size again.

 

The Technical Specifics

Most Antivirus for Amazon S3 customers scan files that fit under the Fargate disk cap of 200 GB. But customers in industries with files larger than 200 GB that require malware scanning need an option that can handle more. To meet that need, Antivirus for Amazon S3 now supports scanning files up to 5 TB in size (Amazon S3's maximum object size). It does this by bypassing Fargate's internal disk limitation and using one or more EC2 instances for such files.

Extra large file scanning can be triggered by any of the scanning models we offer: event-based scanning, retro scanning, and even API-based scanning (the scan-existing API only). When any of these scanning agents picks up a file that is too large to scan and the Extra Large File Scanning toggle is on (snapshot from the console below), a job is created to be kicked off. "Too large" is based on the disk size users assign under Agent Settings or API Agent Settings within the console.

[Image: the Extra Large File Scanning toggle in the console]

The job will be picked up and kicked off within 10 minutes. A temporary EC2 instance will be spun up with an EBS volume of the size defined in the Disk Size field. The EC2 instance will pick up the file and scan it. Because it is a "job", it is monitored on the Jobs page within Antivirus for Amazon S3. There you can watch the job move through "Not Started" (while waiting for the EC2 instance to start up), "Scanning", and "Completed". Each large file is treated as its own job: if 50 large files come in, 50 jobs will be kicked off, each lasting as long as it takes to scan its individual file.
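To make the job lifecycle concrete, here is a minimal Python sketch of the three statuses and the one-job-per-file rule. It is only an illustration: `JobStatus`, `LargeFileJob`, `create_jobs`, and the object keys are hypothetical names, not part of the Antivirus for Amazon S3 product or API.

```python
from dataclasses import dataclass
from enum import Enum


class JobStatus(Enum):
    # Statuses as they are named on the Jobs page in this post.
    NOT_STARTED = "Not Started"  # waiting for the EC2 instance to start
    SCANNING = "Scanning"
    COMPLETED = "Completed"


@dataclass
class LargeFileJob:
    # Hypothetical model of one Extra Large File Scan job.
    object_key: str
    status: JobStatus = JobStatus.NOT_STARTED

    def advance(self):
        """Move to the next status; Completed is terminal."""
        order = list(JobStatus)  # members iterate in definition order
        idx = order.index(self.status)
        if idx < len(order) - 1:
            self.status = order[idx + 1]
        return self.status


def create_jobs(object_keys):
    """Create one job per large file, as described above."""
    return [LargeFileJob(key) for key in object_keys]


# Made-up keys standing in for 50 incoming large files.
jobs = create_jobs([f"videos/file-{i}.mp4" for i in range(50)])
print(len(jobs), jobs[0].status.value)
# → 50 Not Started
```

Each job advances independently, which mirrors the point that one slow file does not hold up the others.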

 

Sample Scenarios and Scanning Outcomes

(Note: subtract 5 GB of overhead from the disk sizes for both the scanning agents and the Extra Large File Scan.)

| Scenario | Scanning Outcome |
| --- | --- |
| File is smaller than the defined scanning agent disk size | Scanning agent picks the file up to scan. Scan Result is whatever the outcome of the file is. |
| File is larger than the defined scanning agent disk size; Extra Large File Scanning is off | Scanning agent rejects the file and does not even attempt to scan it. Scan Result is set to Unscannable. |
| File is larger than the defined scanning agent disk size; Extra Large File Scanning is on | Scanning agent creates an Extra Large File Scan Job and moves on to the next file. The Large File Job shows up on the Jobs page and is kicked off within 10 minutes of creation. Scan Result is set to whatever the outcome of the file is. |
| File is larger than the Extra Large File Scanning disk size; Extra Large File Scanning is on | Scan Result is set to Unscannable. |
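
The scenarios above boil down to a small decision function. The sketch below is one way to express it in Python, under stated assumptions: `route_file`, `Route`, and the parameter names are hypothetical, sizes are in GB, and a file exactly at the limit is treated as scannable. It models the outcomes described in this post, not the product's actual implementation.

```python
from enum import Enum

OVERHEAD_GB = 5  # overhead the note above says to subtract from each disk size


class Route(Enum):
    AGENT_SCAN = "scanned by the standard scanning agent"
    XL_JOB = "Extra Large File Scan job created"
    UNSCANNABLE = "Scan Result set to Unscannable"


def route_file(file_gb, agent_disk_gb, xl_enabled, xl_disk_gb=0):
    """Return the outcome for one file (hypothetical helper).

    Assumption: a file exactly at the usable limit still fits.
    """
    if file_gb <= agent_disk_gb - OVERHEAD_GB:
        return Route.AGENT_SCAN
    if not xl_enabled:
        return Route.UNSCANNABLE
    if file_gb <= xl_disk_gb - OVERHEAD_GB:
        return Route.XL_JOB
    return Route.UNSCANNABLE


# Example: 20 GB agent disk, Extra Large File Scanning on with a 100 GB disk.
print(route_file(50, agent_disk_gb=20, xl_enabled=True, xl_disk_gb=100).name)
# → XL_JOB
```

Note that the toggle only changes the outcome for files that exceed the agent's usable disk; smaller files always take the standard path.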
 
 

But Wait, It Can Be Used For More

Extra Large File Scanning doesn't have to be reserved for really big files; it can be used to scan a file of any size over the disk size you have assigned to the standard scanning agents we provide. For example, let's say you occasionally need to process 50 GB files, but it isn't worth it to you to keep a larger disk attached to every running scanning agent. So you keep the default disk size of 20 GB and switch on the Extra Large File Scanning toggle within Antivirus for Amazon S3.

After subtracting the 5 GB of overhead, any file 15 GB or smaller will be processed by the scanning agent, and any file greater than 15 GB will be scanned by the extra large file scanning process. This ensures that no file is ever skipped due to size, and if large files are rare in your system, you don't have to carry the expense of a larger default disk.
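
The arithmetic in this example can be checked with a few lines of Python. The file sizes below are made up for illustration, and the sketch assumes the Extra Large File Scanning disk is large enough for every oversized file in the batch.

```python
OVERHEAD_GB = 5     # overhead subtracted from the disk size, per the note earlier in the post
AGENT_DISK_GB = 20  # default agent disk size used in this example

# Largest file the standard agent can take: 20 - 5 = 15 GB.
agent_limit_gb = AGENT_DISK_GB - OVERHEAD_GB

incoming_gb = [2, 15, 16, 50]  # hypothetical incoming file sizes in GB
by_agent = [s for s in incoming_gb if s <= agent_limit_gb]
by_xl_job = [s for s in incoming_gb if s > agent_limit_gb]

print(agent_limit_gb, by_agent, by_xl_job)
# → 15 [2, 15] [16, 50]
```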

 
 