
Featured Blog Posts – May 2016 Archive (3)

7 Tools to extract text from HTML document

I want to share an interesting article about data scraping that you might need in your business. The article below is mainly reprinted from here

Text in an HTML document is the content that is placed between HTML tags like <a> </a> , <title> </title>. Sometimes we want to extract the text in the HTML document and there are two methods that can…


Added by Nora Choi on May 31, 2016 at 2:30am — No Comments
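To illustrate the idea in the teaser above of pulling out the text that sits between tags, here is a minimal sketch using only Python's standard-library `html.parser`. It is not one of the tools or methods from the linked article; the names `TextExtractor` and `extract_text` are hypothetical, and skipping `<script>` and `<style>` bodies is an assumption about what counts as "text".

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text found between tags (hypothetical helper, not from the article)."""

    # Assumption: script and style bodies are not wanted as document text.
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    sample = "<html><head><title>Hello</title></head><body><a href='#'>World</a></body></html>"
    print(extract_text(sample))
```

A dedicated parsing library would likely handle malformed markup more gracefully; the standard-library parser is used here only to keep the sketch self-contained.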

Hadoop Yarn explanation and container memory allocations

YARN ResourceManager (the YARN service master component)

1) Controls the total resource capacity of the cluster

2) Whenever a container is needed in the cluster, it enforces the minimum container size, which is controlled by the YARN configuration property

→ yarn.scheduler.minimum-allocation-mb 1024 (this value changes based on the cluster's RAM capacity)

Description: The minimum allocation for every container request at the RM, in MBs.…


Added by skumar T on May 30, 2016 at 8:00pm — No Comments
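As a rough illustration of how the minimum allocation in the YARN teaser above affects container sizes, the sketch below rounds a memory request up to a multiple of the minimum. This rounding rule, the 8192 MB maximum, and the function name `normalize_container_mb` are assumptions made for the example and are not stated in the post; the actual behavior depends on the scheduler and its configuration.

```python
import math


def normalize_container_mb(requested_mb, minimum_mb=1024, maximum_mb=8192):
    """Round a container memory request up to a multiple of the minimum allocation.

    Assumptions (not from the post): requests are rounded up to the next
    multiple of minimum_mb, and requests above maximum_mb are rejected.
    """
    if requested_mb > maximum_mb:
        raise ValueError(
            f"requested {requested_mb} MB exceeds maximum allocation {maximum_mb} MB"
        )
    # Treat zero or negative requests as the smallest possible request.
    multiples = math.ceil(max(requested_mb, 1) / minimum_mb)
    rounded = multiples * minimum_mb
    # Never exceed the maximum, even if it is not a multiple of the minimum.
    return min(rounded, maximum_mb)


if __name__ == "__main__":
    for request in (1, 1024, 1500, 4097):
        print(request, "->", normalize_container_mb(request))
```

Under these assumptions, a 1500 MB request would occupy 2048 MB, which is why the minimum is usually tuned to the cluster's RAM capacity.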

Data has always existed, the key is the right data

What do the Library of Alexandria, the Normans and a book have to do with data? I never thought about it.

The Library...

...at Alexandria was in charge of collecting all the world's knowledge, and most of the staff was occupied with the task of translating works onto papyrus paper... 1

Or The Normans and the...

Domesday Book (Latin: Liber de Wintonia "Book of…


Added by George Psistakis on May 20, 2016 at 5:20am — No Comments


© 2019 BigDataNews.com is a subsidiary of DataScienceCentral LLC and not affiliated with Systap
