No Space Left On Device Error

I submitted an Apache Spark application to an Amazon EMR cluster. The application fails with a 'no space left on device' stage failure.

Short Description

Spark uses local disks on the core and task nodes to store intermediate data. If the disks run out of space, the job fails with a 'no space left on device' error. Use one of the following methods to resolve this error:

Add more Amazon Elastic Block Store (Amazon EBS) capacity.
Add more Spark partitions.
Use a bootstrap action to dynamically scale up storage on the core and task nodes. For more information and a recommended bootstrap action script, see Dynamically scale up storage on Amazon EMR clusters.
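Before choosing a method, it can help to confirm which local mount is actually full. The following is a minimal sketch, not part of the original procedure, that reports the free fraction of each mount using only the JDK. The object name LocalDiskCheck, the default mount list (/mnt, /mnt1, /mnt2), and the 10% warning threshold are assumptions for illustration; pass the node's real mount points as arguments.

```scala
import java.io.File

object LocalDiskCheck {
  // Fraction of the mount's capacity that is still usable, or None when the
  // path does not exist (java.io.File.getTotalSpace returns 0 in that case).
  def freeFraction(path: String): Option[Double] = {
    val f = new File(path)
    val total = f.getTotalSpace
    if (total <= 0L) None
    else Some(f.getUsableSpace.toDouble / total.toDouble)
  }

  def main(args: Array[String]): Unit = {
    // Assumed mount points; override them on the command line.
    val mounts = if (args.nonEmpty) args.toSeq else Seq("/mnt", "/mnt1", "/mnt2")
    val warnBelow = 0.10 // hypothetical threshold, not an EMR or Spark setting

    mounts.foreach { m =>
      freeFraction(m) match {
        case Some(frac) =>
          val flag = if (frac < warnBelow) " <-- low" else ""
          println(f"$m%-8s ${frac * 100}%5.1f%% free$flag")
        case None =>
          println(s"$m not found")
      }
    }
  }
}
```

Run it on a core or task node while the job is active, because Spark's intermediate data is usually cleaned up after the application ends.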

Resolution

Add more EBS capacity

For new clusters: use larger EBS volumes

Launch an Amazon EMR cluster and choose an Amazon Elastic Compute Cloud (Amazon EC2) instance type with larger EBS volumes. For more information about the amount of storage and number of volumes allocated for each instance type, see Default EBS Storage for Instances.

For running clusters: add more EBS volumes

1. If larger EBS volumes don't resolve the problem, attach more EBS volumes to the core and task nodes.

2. Format and mount the attached volumes. Be sure to use the correct disk number (for example, /mnt1 or /mnt2 instead of /data).

3. Connect to the node using SSH.

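4. Create a /mnt2/yarn directory, and then set ownership of the directory to the YARN user (for example, run sudo mkdir /mnt2/yarn and then sudo chown yarn:yarn /mnt2/yarn).

5. Add the /mnt2/yarn directory to the comma-separated value of the yarn.nodemanager.local-dirs property in /etc/hadoop/conf/yarn-site.xml. For example, a value of /mnt/yarn,/mnt1/yarn becomes /mnt/yarn,/mnt1/yarn,/mnt2/yarn.

6. Restart the NodeManager service so that it picks up the new directory. The restart command depends on the Amazon EMR release version: releases that manage services with systemd typically use sudo systemctl restart hadoop-yarn-nodemanager, while older releases use sudo stop hadoop-yarn-nodemanager followed by sudo start hadoop-yarn-nodemanager.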

Add more Spark partitions

Depending on how many core and task nodes are in the cluster, consider increasing the number of Spark partitions, for example by calling repartition on the DataFrame with a larger partition count in your Scala code.
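As a minimal sketch of increasing the partition count, the Scala below calls repartition on a DataFrame. The object name RepartitionSketch, the helper addPartitions, the generated input, and the value 500 are illustrative assumptions; substitute your own DataFrame and choose a count that suits your data volume and cluster size.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object RepartitionSketch {
  // Hypothetical helper: returns a DataFrame with exactly numPartitions
  // partitions. repartition triggers a full shuffle of the data.
  def addPartitions(df: DataFrame, numPartitions: Int): DataFrame =
    df.repartition(numPartitions)

  def main(args: Array[String]): Unit = {
    // The master is expected to come from spark-submit on the cluster.
    val spark = SparkSession.builder().appName("RepartitionSketch").getOrCreate()

    // Placeholder input; replace with the DataFrame your job actually reads.
    val df = spark.range(0L, 1000000L).toDF("id")

    val numPartitions = 500 // illustrative value only
    val newDF = addPartitions(df, numPartitions)

    println(s"Partitions before: ${df.rdd.getNumPartitions}, after: ${newDF.rdd.getNumPartitions}")
    spark.stop()
  }
}
```

Because repartition shuffles all of the data, apply it once, early in the job, rather than before every stage.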

Related Information

How can I troubleshoot stage failures in Spark jobs on Amazon EMR?

Hello,

This is on an Odroid HC2, root drive is a 32GB micro SD card, with a 4TB SATA data drive connected.

When trying to install some extra plugins (shellinabox, for example) I'm getting a 'No space left' error (28).

I'm not sure what or how the root partition is getting used up. I'm including screenshots showing the storage disks, and partitions. The data drive is LUKS encrypted.

Storage_Disks.png
Storage_File systems.png
Storage_Encryption.png

Below is what I get with the du command on the root folder.

Ideas...?

Thanks!
