
Save job results directly to DDFS with Disco

Posted on : 10-05-2012 | By : Paolo Iannelli | In : BigData


If you want Disco to save a job's processed data directly to a DDFS tag, you only need to add this extra parameter:

save=True

to your run method.

Example:
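A minimal sketch of such a call, assuming the classic disco.core.Job API and a running Disco master. The word-count functions, the run_and_save helper and its input_urls argument are placeholders for illustration, not part of the original job:

```python
# Sketch only: assumes the classic Disco worker API (disco.core.Job).
# The word-count logic and input URLs are placeholders.

def fun_map(line, params):
    # Emit (word, 1) for every whitespace-separated word in the input line.
    for word in line.split():
        yield word, 1

def fun_reduce(pairs, params):
    # Sum the counts per word; a plain dict keeps the sketch dependency-free.
    totals = {}
    for word, count in pairs:
        totals[word] = totals.get(word, 0) + count
    for word in sorted(totals):
        yield word, totals[word]

def run_and_save(input_urls):
    # Hypothetical helper. Disco is imported lazily so the functions above
    # can be tried without Disco installed.
    from disco.core import Job
    job = Job(name="myNiceJob")
    # save=True asks Disco to store the job results in DDFS.
    job.run(input=input_urls, map=fun_map, reduce=fun_reduce, save=True)
    # wait() blocks until the job finishes and returns the result locations.
    return job.wait(show=True)
```

Keeping fun_map and fun_reduce free of Disco imports makes them easy to try locally before submitting the job.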

After the job finishes, its results are already stored in DDFS under a tag named disco:job:results:myNiceJob@xxxx:xxxx:xxxx, which you can use as input for other jobs.

WARNING: this is not documented anywhere, but using this option adds a read/write token to the result tag, which limits access to it. If you need to reuse the content of that tag, you always have to specify the token “job” (that’s the automatically assigned name) to access it. It took me a couple of hours of digging to understand why I was getting this error:

Unable to access resource (http://localhost:8989/ddfs/tag/disco:job:results:MyNiceJob@538:9c3fb:b9f17): Incorrect or missing token. (401)
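A minimal sketch of reading such a tag with the token, assuming the disco.ddfs.DDFS client accepts a token keyword on blobs(); check this against your Disco version. The helper name saved_result_blobs is hypothetical:

```python
def saved_result_blobs(ddfs, tag, token="job"):
    # ddfs is expected to be a disco.ddfs.DDFS client (assumption: its
    # blobs() method takes a token keyword). "job" is the token name that
    # save=True assigns automatically, as described above.
    return list(ddfs.blobs(tag, token=token))

# Usage against a real cluster (not run here):
# from disco.ddfs import DDFS
# blobs = saved_result_blobs(DDFS(), "disco:job:results:myNiceJob@xxxx:xxxx:xxxx")
```

Without the token argument, the same request should fail with the 401 shown above.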

Have fun!