Ebook640 pages2 hours

Julia for Data Science

Name: Julia for Data Science
Author: Anshul Joshi
ISBN: 9781783553860

By Anshul Joshi

Rating: 0 out of 5 stars

()

Read preview

About this ebook

About This Book

An in-depth exploration of Julia's growing ecosystem of packages
Work with the most powerful open-source libraries for deep learning, data wrangling, and data visualization
Learn about deep learning using Mocha.jl and give speed and high performance to data analysis on large data sets

Who This Book Is For

This book is aimed at data analysts and aspiring data scientists who have a basic knowledge of Julia or are completely new to it. The book also appeals to those competent in R and Python and wish to adopt Julia to improve their skills set in Data Science. It would be beneficial if the readers have a good background in statistics and computational mathematics.

Skip carousel

LanguageEnglish

PublisherPackt Publishing

Release dateSep 30, 2016

ISBN9781783553860

Author

Anshul Joshi

Related authors

Skip carousel

Related to Julia for Data Science

Related ebooks

Skip carousel

Mastering Python Scientific Computing
Ebook
Mastering Python Scientific Computing
byMehta Hemant Kumar
Rating: 4 out of 5 stars
4/5
Scientific Computing with Python 3
Ebook
Scientific Computing with Python 3
byJan Erik Solem
Rating: 0 out of 5 stars
0 ratings
Julia: High Performance Programming
Ebook
Julia: High Performance Programming
byMalcolm Sherrington
Rating: 0 out of 5 stars
0 ratings
Machine Learning with R - Third Edition: Expert techniques for predictive modeling, 3rd Edition
Ebook
Machine Learning with R - Third Edition: Expert techniques for predictive modeling, 3rd Edition
byBrett Lantz
Rating: 0 out of 5 stars
0 ratings
Learning Data Mining with Python
Ebook
Learning Data Mining with Python
byRobert Layton
Rating: 0 out of 5 stars
0 ratings
Mastering Data Mining with Python – Find patterns hidden in your data
Ebook
Mastering Data Mining with Python – Find patterns hidden in your data
byMegan Squire
Rating: 0 out of 5 stars
0 ratings
Mastering Spark for Data Science
Ebook
Mastering Spark for Data Science
byAndrew Morgan
Rating: 0 out of 5 stars
0 ratings
Learning Predictive Analytics with R
Ebook
Learning Predictive Analytics with R
byMayor Eric
Rating: 0 out of 5 stars
0 ratings
Machine Learning with R - Second Edition
Ebook
Machine Learning with R - Second Edition
byBrett Lantz
Rating: 5 out of 5 stars
5/5
Practical Machine Learning
Ebook
Practical Machine Learning
byGollapudi Sunila
Rating: 2 out of 5 stars
2/5
Python for Finance
Ebook
Python for Finance
byYuxing Yan
Rating: 3 out of 5 stars
3/5
Scala for Machine Learning
Ebook
Scala for Machine Learning
byNicolas Patrick R.
Rating: 0 out of 5 stars
0 ratings
Mastering Python for Data Science
Ebook
Mastering Python for Data Science
bySamir Madhavan
Rating: 3 out of 5 stars
3/5
Django 1.1 Testing and Debugging
Ebook
Django 1.1 Testing and Debugging
byKaren M. Tracey
Rating: 4 out of 5 stars
4/5
Python Data Analysis
Ebook
Python Data Analysis
byIvan Idris
Rating: 4 out of 5 stars
4/5
Learning pandas
Ebook
Learning pandas
byHeydt Michael
Rating: 4 out of 5 stars
4/5
Deep Learning with R, Second Edition
Ebook
Deep Learning with R, Second Edition
byFrancois Chollet
Rating: 0 out of 5 stars
0 ratings
Python: Real-World Data Science
Ebook
Python: Real-World Data Science
byRobert Layton
Rating: 0 out of 5 stars
0 ratings
Mastering Objectoriented Python
Ebook
Mastering Objectoriented Python
bySteven F. Lott
Rating: 5 out of 5 stars
5/5
Pattern Recognition
Ebook
Pattern Recognition
byKonstantinos Koutroumbas
Rating: 4 out of 5 stars
4/5
The Supervised Learning Workshop - Second Edition: A New, Interactive Approach to Understanding Supervised Learning Algorithms, 2nd Edition
Ebook
The Supervised Learning Workshop - Second Edition: A New, Interactive Approach to Understanding Supervised Learning Algorithms, 2nd Edition
byBlaine Bateman
Rating: 0 out of 5 stars
0 ratings
Functional Python Programming
Ebook
Functional Python Programming
bySteven Lott
Rating: 0 out of 5 stars
0 ratings
Learning R Programming
Ebook
Learning R Programming
byKun Ren
Rating: 5 out of 5 stars
5/5
Building Machine Learning Systems with Python
Ebook
Building Machine Learning Systems with Python
byWilli Richert
Rating: 4 out of 5 stars
4/5
Machine Learning with R
Ebook
Machine Learning with R
byBrett Lantz
Rating: 4 out of 5 stars
4/5
Mastering Scientific Computing with R
Ebook
Mastering Scientific Computing with R
byPaul Gerrard
Rating: 3 out of 5 stars
3/5
Learning Predictive Analytics with Python
Ebook
Learning Predictive Analytics with Python
byKumar Ashish
Rating: 0 out of 5 stars
0 ratings
Data Science, Analytics and Machine Learning with R
Ebook
Data Science, Analytics and Machine Learning with R
byLuiz Paulo Favero
Rating: 0 out of 5 stars
0 ratings
Simulation for Data Science with R
Ebook
Simulation for Data Science with R
byMatthias Templ
Rating: 0 out of 5 stars
0 ratings
Python: Deeper Insights into Machine Learning
Ebook
Python: Deeper Insights into Machine Learning
byJohn Hearty
Rating: 0 out of 5 stars
0 ratings

Data Modeling & Design For You

Skip carousel

The Secrets of ChatGPT Prompt Engineering for Non-Developers
Ebook
The Secrets of ChatGPT Prompt Engineering for Non-Developers
byCea West
Rating: 5 out of 5 stars
5/5
Data Analytics for Beginners: Introduction to Data Analytics
Ebook
Data Analytics for Beginners: Introduction to Data Analytics
byAnthony S. Williams
Rating: 4 out of 5 stars
4/5
Python Machine Learning: A Practical Beginner's Guide to Understanding Machine Learning, Deep Learning and Neural Networks with Python, Scikit-Learn, Tensorflow and Keras
Ebook
Python Machine Learning: A Practical Beginner's Guide to Understanding Machine Learning, Deep Learning and Neural Networks with Python, Scikit-Learn, Tensorflow and Keras
byBrandon Railey
Rating: 0 out of 5 stars
0 ratings
Thinking in Algorithms: Strategic Thinking Skills, #2
Ebook
Thinking in Algorithms: Strategic Thinking Skills, #2
byAlbert Rutherford
Rating: 5 out of 5 stars
5/5
Microsoft 365 Excel: The Only App That Matters: Calculations, Analytics, Modeling, Data Analysis and Dashboard Reporting for the New Era of Dynamic Data Driven Decision Making & Insight
Ebook
Microsoft 365 Excel: The Only App That Matters: Calculations, Analytics, Modeling, Data Analysis and Dashboard Reporting for the New Era of Dynamic Data Driven Decision Making & Insight
byMike Girvin
Rating: 3 out of 5 stars
3/5
Power Pivot and Power BI: The Excel User's Guide to DAX, Power Query, Power BI & Power Pivot in Excel 2010-2016
Ebook
Power Pivot and Power BI: The Excel User's Guide to DAX, Power Query, Power BI & Power Pivot in Excel 2010-2016
byRob Collie
Rating: 4 out of 5 stars
4/5
Mastering Agile User Stories
Ebook
Mastering Agile User Stories
byDeEtta Balthazar
Rating: 4 out of 5 stars
4/5
Spreadsheets To Cubes (Advanced Data Analytics for Small Medium Business): Data Science
Ebook
Spreadsheets To Cubes (Advanced Data Analytics for Small Medium Business): Data Science
byalasdair gilchrist
Rating: 0 out of 5 stars
0 ratings
Data Visualization: a successful design process
Ebook
Data Visualization: a successful design process
byAndy Kirk
Rating: 4 out of 5 stars
4/5
Data Analytics with Python: Data Analytics in Python Using Pandas
Ebook
Data Analytics with Python: Data Analytics in Python Using Pandas
byFrank Millstein
Rating: 3 out of 5 stars
3/5
Supercharge Power BI: Power BI is Better When You Learn To Write DAX
Ebook
Supercharge Power BI: Power BI is Better When You Learn To Write DAX
byMatt Allington
Rating: 5 out of 5 stars
5/5
WordPress For Beginners - How To Set Up A Self Hosted WordPress Blog
Ebook
WordPress For Beginners - How To Set Up A Self Hosted WordPress Blog
byCyrus Jackson
Rating: 0 out of 5 stars
0 ratings
Hacks To Crush Plc Program Fast & Efficiently Everytime... : Coding, Simulating & Testing Programmable Logic Controller With Examples
Ebook
Hacks To Crush Plc Program Fast & Efficiently Everytime... : Coding, Simulating & Testing Programmable Logic Controller With Examples
byMichael Blake
Rating: 5 out of 5 stars
5/5
Learn T-SQL Querying: A guide to developing efficient and elegant T-SQL code
Ebook
Learn T-SQL Querying: A guide to developing efficient and elegant T-SQL code
byPedro Lopes
Rating: 0 out of 5 stars
0 ratings
Graph Databases in Action: Examples in Gremlin
Ebook
Graph Databases in Action: Examples in Gremlin
byJosh Perryman
Rating: 0 out of 5 stars
0 ratings
R Programming - a Comprehensive Guide: Software
Ebook
R Programming - a Comprehensive Guide: Software
byEditor IJSMI
Rating: 0 out of 5 stars
0 ratings
Bayesian Analysis with Python
Ebook
Bayesian Analysis with Python
byOsvaldo Martin
Rating: 5 out of 5 stars
5/5
Neural Networks: Neural Networks Tools and Techniques for Beginners
Ebook
Neural Networks: Neural Networks Tools and Techniques for Beginners
byJohn Slavio
Rating: 5 out of 5 stars
5/5
The Systems Thinker - Mental Models: The Systems Thinker Series, #3
Ebook
The Systems Thinker - Mental Models: The Systems Thinker Series, #3
byAlbert Rutherford
Rating: 0 out of 5 stars
0 ratings
What Makes Us Smart: The Computational Logic of Human Cognition
Ebook
What Makes Us Smart: The Computational Logic of Human Cognition
bySamuel J. Gershman
Rating: 0 out of 5 stars
0 ratings
DAX Patterns: Second Edition
Ebook
DAX Patterns: Second Edition
byMarco Russo
Rating: 5 out of 5 stars
5/5
Deep Learning: An Essential Guide to Deep Learning for Beginners Who Want to Understand How Deep Neural Networks Work and Relate to Machine Learning and Artificial Intelligence
Ebook
Deep Learning: An Essential Guide to Deep Learning for Beginners Who Want to Understand How Deep Neural Networks Work and Relate to Machine Learning and Artificial Intelligence
byHerbert Jones
Rating: 5 out of 5 stars
5/5
Mastering Python Design Patterns
Ebook
Mastering Python Design Patterns
bySakis Kasampalis
Rating: 0 out of 5 stars
0 ratings
A Concise Guide to Object Orientated Programming
Ebook
A Concise Guide to Object Orientated Programming
byalasdair gilchrist
Rating: 0 out of 5 stars
0 ratings
R in Action, Third Edition: Data analysis and graphics with R and Tidyverse
Ebook
R in Action, Third Edition: Data analysis and graphics with R and Tidyverse
byRobert I. Kabacoff
Rating: 0 out of 5 stars
0 ratings
Living in Data: A Citizen's Guide to a Better Information Future
Ebook
Living in Data: A Citizen's Guide to a Better Information Future
byJer Thorp
Rating: 4 out of 5 stars
4/5
Tableau Desktop Certified Associate: Exam Guide: Develop your Tableau skills and prepare for Tableau certification with tips from industry experts
Ebook
Tableau Desktop Certified Associate: Exam Guide: Develop your Tableau skills and prepare for Tableau certification with tips from industry experts
byDmitry Anoshin
Rating: 0 out of 5 stars
0 ratings
Machine Learning: A Comprehensive, Step-by-Step Guide to Learning and Understanding Machine Learning Concepts, Technology and Principles for Beginners: 1
Ebook
Machine Learning: A Comprehensive, Step-by-Step Guide to Learning and Understanding Machine Learning Concepts, Technology and Principles for Beginners: 1
byPeter Bradley
Rating: 0 out of 5 stars
0 ratings
Machine Learning Interview Questions
Ebook
Machine Learning Interview Questions
byTech Interviews
Rating: 5 out of 5 stars
5/5
Mastering Redis
Ebook
Mastering Redis
byJeremy Nelson
Rating: 0 out of 5 stars
0 ratings

Related podcast episodes

Skip carousel

#111 The Rise of the Julia Programming Language
Podcast episode
#111 The Rise of the Julia Programming Language
byDataFramed
0 ratings
0% found this document useful
Learning Long-Time Dependencies with RNNs w/ Konstantin Rusch - #484: Today we conclude our 2021 ICLR coverage joined by Konstantin Rusch, a PhD Student at ETH Zurich. In our conversation with Konstantin, we explore his recent papers, titled coRNN and uniCORNN respectively, which focus on a novel architecture of...
Podcast episode
Learning Long-Time Dependencies with RNNs w/ Konstantin Rusch - #484: Today we conclude our 2021 ICLR coverage joined by Konstantin Rusch, a PhD Student at ETH Zurich. In our conversation with Konstantin, we explore his recent papers, titled coRNN and uniCORNN respectively, which focus on a novel architecture of...
byThe TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
0 ratings
0% found this document useful
[MINI] Long Short Term Memory: Thanks to our sponsor brilliant.org/dataskeptics A Long Short Term Memory (LSTM) is a neural unit, often used in Recurrent Neural Network (RNN) which attempts to provide the network the capacity to store information for longer periods of time. An...
Podcast episode
[MINI] Long Short Term Memory: Thanks to our sponsor brilliant.org/dataskeptics A Long Short Term Memory (LSTM) is a neural unit, often used in Recurrent Neural Network (RNN) which attempts to provide the network the capacity to store information for longer periods of time. An...
byData Skeptic
0 ratings
0% found this document useful
Recurrent Neural Nets: This week, we're doing a crash course in recurren…
Podcast episode
Recurrent Neural Nets: This week, we're doing a crash course in recurren…
byLinear Digressions
0 ratings
0% found this document useful
Episode 19 (Python for Data Science - Python Files - Scripts and Modules)
Podcast episode
Episode 19 (Python for Data Science - Python Files - Scripts and Modules)
byHow to Data (Joshiverse- Journey of a Budding Data Scientist)
0 ratings
0% found this document useful
#51 Francois Chollet - Intelligence and Generalisation
Podcast episode
#51 Francois Chollet - Intelligence and Generalisation
byMachine Learning Street Talk (MLST)
0 ratings
0% found this document useful
This Week In Machine Learning & AI - 5/20/16: AI at Google I/O, Amazon's Deep Learning DSSTNE: This Week In Machine Learning & AI - May 20, 2016…
Podcast episode
This Week In Machine Learning & AI - 5/20/16: AI at Google I/O, Amazon's Deep Learning DSSTNE: This Week In Machine Learning & AI - May 20, 2016…
byThe TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
0 ratings
0% found this document useful
#10 Data Science, the Environment and MOOCs: Air pollution, the environment and data science: where do these intersect? Find out in this episode of DataFramed, in which Hugo speaks with Roger Peng, Professor in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health...
Podcast episode
#10 Data Science, the Environment and MOOCs: Air pollution, the environment and data science: where do these intersect? Find out in this episode of DataFramed, in which Hugo speaks with Roger Peng, Professor in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health...
byDataFramed
0 ratings
0% found this document useful
Exploring deep reinforcement learning: with Thomas Simonini of Hugging Face
Podcast episode
Exploring deep reinforcement learning: with Thomas Simonini of Hugging Face
byPractical AI: Machine Learning, Data Science
0 ratings
0% found this document useful
Open Source TensorFlow with Yifei Feng: Yifei Feng, a TensorFlow software engineer, shares with Melanie and Mark about her work on the open source TensorFlow project and the tools she builds.
Podcast episode
Open Source TensorFlow with Yifei Feng: Yifei Feng, a TensorFlow software engineer, shares with Melanie and Mark about her work on the open source TensorFlow project and the tools she builds.
byGoogle Cloud Platform Podcast
100%
100% found this document useful
Putting Airflow Into Production With James Meickle - Episode 43: Lessons Learned While Building A Data Science Platform With Airflow (Interview)
Podcast episode
Putting Airflow Into Production With James Meickle - Episode 43: Lessons Learned While Building A Data Science Platform With Airflow (Interview)
byData Engineering Podcast
0 ratings
0% found this document useful
Advantages of Completing Small Python Projects
Podcast episode
Advantages of Completing Small Python Projects
byThe Real Python Podcast
0 ratings
0% found this document useful
040: Graph Databases: Traditional relational databases like MySQL or Postgres are really good at providing many solutions to the problem of persisting state. But these types of database are really horrible at querying highly connected models in an efficient way. Graph datab...
Podcast episode
040: Graph Databases: Traditional relational databases like MySQL or Postgres are really good at providing many solutions to the problem of persisting state. But these types of database are really horrible at querying highly connected models in an efficient way. Graph datab...
byPHPRoundtable Podcast
0 ratings
0% found this document useful
#70 Beyond the Language Wars: R & Python for the Modern Data Scientist
Podcast episode
#70 Beyond the Language Wars: R & Python for the Modern Data Scientist
byDataFramed
0 ratings
0% found this document useful
What is beyond PoCs? ML project-hurdles you should be prepared to take with Balázs Kégl - 016: Why do we do PoCs all the time and why do we struggle with Real projects? We are going to talk about ML project-hurdles with the head of AI at Huawei Paris, Balazs Kegl.
Podcast episode
What is beyond PoCs? ML project-hurdles you should be prepared to take with Balázs Kégl - 016: Why do we do PoCs all the time and why do we struggle with Real projects? We are going to talk about ML project-hurdles with the head of AI at Huawei Paris, Balazs Kegl.
byMachine Learning Cafe
0 ratings
0% found this document useful
Alexandr Draganov, "Mathematical Tools for Real-World Applications: A Gentle Introduction for Students and Practitioners" (MIT Press, 2022): An interview with Alexandr Draganov
Podcast episode
Alexandr Draganov, "Mathematical Tools for Real-World Applications: A Gentle Introduction for Students and Practitioners" (MIT Press, 2022): An interview with Alexandr Draganov
byNew Books in Mathematics
0 ratings
0% found this document useful
Making Automated Machine Learning More Accessible With EvalML: An interview with Angela Lin and Jeremy Shih about the open source EvalML framework for building automated machine learning workflows.
Podcast episode
Making Automated Machine Learning More Accessible With EvalML: An interview with Angela Lin and Jeremy Shih about the open source EvalML framework for building automated machine learning workflows.
byThe Python Podcast.__init__
100%
100% found this document useful
433: Falling for FastAPI: Mike's falling in love with FastAPI and gives us a hint at the next project he's building.
Podcast episode
433: Falling for FastAPI: Mike's falling in love with FastAPI and gives us a hint at the next project he's building.
byCoder Radio
0 ratings
0% found this document useful
Measuring Your Python Learning Progress
Podcast episode
Measuring Your Python Learning Progress
byThe Real Python Podcast
100%
100% found this document useful
Do You Dare Run Your ML Experiments in Production? with Ville Tuulos - #523
Podcast episode
Do You Dare Run Your ML Experiments in Production? with Ville Tuulos - #523
byThe TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
0 ratings
0% found this document useful
Unraveling Python's Syntax to Its Core With Brett Cannon
Podcast episode
Unraveling Python's Syntax to Its Core With Brett Cannon
byThe Real Python Podcast
100%
100% found this document useful
This Week In Machine Learning & AI - 5/27/16: The White House on AI & Aggressive Self-Driving Cars: This Week in Machine Learning & AI brings you the…
Podcast episode
This Week In Machine Learning & AI - 5/27/16: The White House on AI & Aggressive Self-Driving Cars: This Week in Machine Learning & AI brings you the…
byThe TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
0 ratings
0% found this document useful
Level Up Your Data Platform With Active Metadata: A conversation with Atlan co-founder Prukalpa Sankar about the idea of active metadata and how it can reduce the toil involved in managing a data platform
Podcast episode
Level Up Your Data Platform With Active Metadata: A conversation with Atlan co-founder Prukalpa Sankar about the idea of active metadata and how it can reduce the toil involved in managing a data platform
byData Engineering Podcast
0 ratings
0% found this document useful
#059 - 10 Python clean code tips drawn from code reviews
Podcast episode
#059 - 10 Python clean code tips drawn from code reviews
byPybites Podcast
0 ratings
0% found this document useful
#10 Exploratory Analysis of Bayesian Models, with ArviZ and Ari Hartikainen
Podcast episode
#10 Exploratory Analysis of Bayesian Models, with ArviZ and Ari Hartikainen
byLearning Bayesian Statistics
0 ratings
0% found this document useful
Unlocking The Power of Data Lineage In Your Platform with OpenLineage: An interview with Julien Le Dem about the OpenLineage specification and the opportunity that it offers for simplifying the tracking and analysis of data lineage across your data platform.
Podcast episode
Unlocking The Power of Data Lineage In Your Platform with OpenLineage: An interview with Julien Le Dem about the OpenLineage specification and the opportunity that it offers for simplifying the tracking and analysis of data lineage across your data platform.
byData Engineering Podcast
0 ratings
0% found this document useful
Ali Ghodsi – The Past, Present, and Future of Big Data – [Founder’s Field Guide, EP.18]: My Guest today is Ali Ghodsi, founder and CEO of Databricks, a data analytics platform for data scientists and developers. He's also the founder of Apache Spark, the open-source project that Databricks is built on, and is an accomplished researcher at...
Podcast episode
Ali Ghodsi – The Past, Present, and Future of Big Data – [Founder’s Field Guide, EP.18]: My Guest today is Ali Ghodsi, founder and CEO of Databricks, a data analytics platform for data scientists and developers. He's also the founder of Apache Spark, the open-source project that Databricks is built on, and is an accomplished researcher at...
byInvest Like the Best with Patrick O'Shaughnessy
0 ratings
0% found this document useful
A Programmer's Guide to Computer Science with Dr. William Springer: Have you failed a job interview because you don't know computer science? William Springer has a PhD in computer science and his books takes you through what you would have learned while earning a four-year computer science degree! Both Scott and William believe in breaking down boundaries, and it starts with this show!
Podcast episode
A Programmer's Guide to Computer Science with Dr. William Springer: Have you failed a job interview because you don't know computer science? William Springer has a PhD in computer science and his books takes you through what you would have learned while earning a four-year computer science degree! Both Scott and William believe in breaking down boundaries, and it starts with this show!
byHanselminutes with Scott Hanselman
100%
100% found this document useful
Python, Django, and Channels: with Andrew Godwin, creator of Django Channels
Podcast episode
Python, Django, and Channels: with Andrew Godwin, creator of Django Channels
byThe Changelog: Software Development, Open Source
0 ratings
0% found this document useful
55: Go on The Web: Summary Andrew Gerrand (@enneff), Developer Advocate at Google & Go core contributor, talks about GoLang and how it is being used in Web Development today as well as the plans for the future of the Go as a platform for the web. Resources Go...
Podcast episode
55: Go on The Web: Summary Andrew Gerrand (@enneff), Developer Advocate at Google & Go core contributor, talks about GoLang and how it is being used in Web Development today as well as the plans for the future of the Go as a platform for the web. Resources Go...
byThe Web Platform Podcast
100%
100% found this document useful

Skip carousel

Scikit-Learn: The Ultimate Python Library
APC
Article
Scikit-Learn: The Ultimate Python Library
Jul 15, 2019
4 min read
Comparing Time Series Data Like A Pro
Linux Format
Article
Comparing Time Series Data Like A Pro
Jun 1, 2021
8 min read
Tensor Flow 101
APC
Article
Tensor Flow 101
Jan 27, 2020
4 min read
How Image Recognition Works
APC
Article
How Image Recognition Works
Nov 4, 2019
4 min read
Build A Static Analysis Development Pipeline
Linux Format
Article
Build A Static Analysis Development Pipeline
Jul 27, 2021
9 min read
Manipulate Data Like A Pro With Pandas
Linux Format
Article
Manipulate Data Like A Pro With Pandas
Jul 27, 2021
7 min read
Deep Learning Tests Billions Of Graphene Combos In 2 Days
Futurity
Article
Deep Learning Tests Billions Of Graphene Combos In 2 Days
Apr 11, 2019
2 min read
Why Is Electron So Controversial?
PC Pro Magazine
Article
Why Is Electron So Controversial?
Oct 8, 2022
5 min read
The Race To Exascale Supercomputers
Maximum PC
Article
The Race To Exascale Supercomputers
Jun 21, 2022
9 min read
Machine Learning – With Zero Programming
APC
Article
Machine Learning – With Zero Programming
Aug 12, 2019
6 min read
Data Analysis
Linux Format
Article
Data Analysis
Mar 10, 2020
Sometimes you receive raw data that needs to be processed before plotting. In Veusz, look under the Data > Operations menu and find lots of options for manipulating data sets. Joining, merging, finding the average, filtering and many more are availab
1 min read
How Spooky Science Helps Us Peer Inside The Planets
All About Space
Article
How Spooky Science Helps Us Peer Inside The Planets
Dec 3, 2020
An assistant professor of computational science at the EPFL research centre in Lausanne, Switzerland, involved in the current research on metallic hydrogen. Could you explain how the machine-learning techniques used in your research work? Why were th
1 min read
Web App Security
Linux Format
Article
Web App Security
Jun 29, 2021
8 min read
Observe The Observers
Linux Format
Article
Observe The Observers
Sep 22, 2020
“Observability comes from control theory and means that you should be able to determine or infer a system’s state by its output. For some setups, this can involve using strace and grepping through logs, but the number of machines that DBA’s are deali
1 min read
Code A Cataloguing Application In Python
Linux Format
Article
Code A Cataloguing Application In Python
Nov 15, 2022
Credit: www.djangoproject.com Matt Holder has been a fan of the open source methodology for over two decades and uses Linux and other tools where possible. More featurepacked source code for this project can be downloaded from https://github.com/mat
8 min read
Generative AI: What Leaders Need To Know
Rotman Management
Article
Generative AI: What Leaders Need To Know
Jan 1, 2024
12 min read
Machine-learning On Your Android Phone?
APC
Article
Machine-learning On Your Android Phone?
Dec 30, 2019
4 min read
Risky Giant Steps Can Solve Optimization Problems Faster
Nautilus
Article
Risky Giant Steps Can Solve Optimization Problems Faster
Sep 5, 2023
4 min read
How And Where You Use Machine-learning
APC
Article
How And Where You Use Machine-learning
Oct 7, 2019
4 min read
Overall Usefulness
Linux Format
Article
Overall Usefulness
Sep 22, 2020
3 min read
Circuit Programs Human Cells to Add and Subtract
Futurity
Article
Circuit Programs Human Cells to Add and Subtract
Apr 15, 2017
A new platform offers a fast and more efficient way to target and program mammalian cells as genetic circuits, even complex ones. “The problem synthetic biologists are trying to solve is how we ask cells to make decisions and try to design a strategy
2 min read
Google Answer Box Strategy
Techfastly
Article
Google Answer Box Strategy
Sep 21, 2020
Leveraging the Google PAA (People Also Ask) element on a Search Results Page for Targeted Content Creation with a Python Scraper All businesses that are online today are creating content at a furious pace. According to Technavio, a research firm, con
7 min read
Deep Learning Technique for Object Detection
Techfastly
Article
Deep Learning Technique for Object Detection
Jun 1, 2021
3 min read
Grid Modeling Overview: Four Types of Models Guiding the Transition to Clean Electricity
Union of Concerned Scientists
Article
Grid Modeling Overview: Four Types of Models Guiding the Transition to Clean Electricity
Apr 25, 2022
6 min read
Producer’s Guide
Future Music
Article
Producer’s Guide
Aug 22, 2019
4 min read
Get Into Gentoo
Linux Format
Article
Get Into Gentoo
May 4, 2021
7 min read
Help Yourself To Avoid These Pitfalls
MacLife
Article
Help Yourself To Avoid These Pitfalls
Dec 11, 2018
GETTING UP TO full speed with the Shortcuts app takes time, and you’ll inevitably make a few mistakes along the way. Having to troubleshoot your efforts doesn’t mean you’ve failed — with years of experience, even professional programmers do this. Tak
2 min read
3d Animation Basics 06
3D World
Article
3d Animation Basics 06
Jul 17, 2019
3 min read
Elektron Model:Samples £410
Future Music
Article
Elektron Model:Samples £410
Jun 27, 2019
5 min read
Integrated Workplace Management Systems
Facility Management
Article
Integrated Workplace Management Systems
Dec 23, 2018
Property and facilities management are data-rich operating worlds. This is becoming even more complex as the Internet of Things (IoT) provides the capability to imbed sensors and diagnostic tools to monitor the use and performance of everything in re
4 min read

Related categories

Skip carousel

Reviews for Julia for Data Science

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Julia for Data Science - Anshul Joshi

Julia for Data Science

Credits

About the Author

About the Reviewer

www.PacktPub.com

Why subscribe?

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Downloading the color images of this book

Errata

Piracy

Questions

1. The Groundwork – Julia's Environment

Julia is different

Setting up the environment

Installing Julia (Linux)

Installing Julia (Mac)

Installing Julia (Windows)

Exploring the source code

Using REPL

Using Jupyter Notebook

Package management

Pkg.status() – package status

Pkg.add() – adding packages

Working with unregistered packages

Pkg.update() – package update

METADATA repository

Developing packages

Creating a new package

Parallel computation using Julia

Julia's key feature – multiple dispatch

Methods in multiple dispatch

Ambiguities – method definitions

Facilitating language interoperability

Calling Python code in Julia

Summary

References

2. Data Munging

What is data munging?

The data munging process

What is a DataFrame?

The NA data type and its importance

DataArray – a series-like data structure

DataFrames – tabular data structures

Installation and using DataFrames.jl

Writing the data to a file

Working with DataFrames

Understanding DataFrames joins

The Split-Apply-Combine strategy

Reshaping the data

Sorting a dataset

Formula - a special data type for mathematical expressions

Pooling data

Web scraping

Summary

References

3. Data Exploration

Sampling

Population

Weight vectors

Inferring column types

Basic statistical summaries

Calculating the mean of the array or dataframe

Scalar statistics

Standard deviations and variances

Measures of variation

Z-scores

Entropy

Quantiles

Modes

Summary of datasets

Scatter matrix and covariance

Computing deviations

Rankings

Counting functions

Histograms

Correlation analysis

Summary

References

4. Deep Dive into Inferential Statistics

Installation

Understanding the sampling distribution

Understanding the normal distribution

Parameter estimation

Type hierarchy in Distributions.jl

Understanding Sampleable

Representing probabilistic distributions

Univariate distributions

Retrieving parameters

Statistical functions

Evaluation of probability

Sampling in Univariate distributions

Understanding Discrete Univariate distributions and types

Bernoulli distribution

Binomial distribution

Continuous distributions

Cauchy distribution

Chi distribution

Chi-square distribution

Truncated distributions

Truncated normal distributions

Understanding multivariate distributions

Multinomial distribution

Multivariate normal distribution

Dirichlet distribution

Understanding matrixvariate distributions

Wishart distribution

Inverse-Wishart distribution

Distribution fitting

Distribution selection

Symmetrical distributions

Skew distributions to the right

Skew distributions to the left

Maximum Likelihood Estimation

Sufficient statistics

Maximum-a-Posteriori estimation

Confidence interval

Interpreting the confidence intervals

Usage

Understanding z-score

Interpreting z-scores

Understanding the significance of the P-value

One-tailed and two-tailed test

Summary

References

5. Making Sense of Data Using Visualization

Difference between using and importall

Pyplot for Julia

Multimedia I/O

Installation

Basic plotting

Plot using sine and cosine

Unicode plots

Installation

Examples

Generating Unicode scatterplots

Generating Unicode line plots

Visualizing using Vega

Installation

Examples

Scatterplot

Heatmaps in Vega

Data visualization using Gadfly

Installing Gadfly

Interacting with Gadfly using plot function

Example

Using Gadfly to plot DataFrames

Using Gadfly to visualize functions and expressions

Generating an image with multiple layers

Generating plots with different aesthetics using statistics

The step function

The quantile-quantile function

Ticks in Gadfly

Generating plots with different aesthetics using Geometry

Boxplots

Using Geometry to create density plots

Using Geometry to create histograms

Bar plots

Histogram2d - the two-dimensional histogram

Smooth line plot

Subplot grid

Horizontal and vertical lines

Plotting a ribbon

Violin plots

Beeswarm plots

Elements - scale

x_continuous and y_continuous

x_discrete and y_discrete

Continuous color scale

Elements - guide

Understanding how Gadfly works

Summary

References

6. Supervised Machine Learning

What is machine learning?

Uses of machine learning

Machine learning and ethics

Machine learning – the process

Different types of machine learning

What is bias-variance trade-off?

Effects of overfitting and underfitting on a model

Understanding decision trees

Building decision trees – divide and conquer

Where should we use decision tree learning?

Advantages of decision trees

Disadvantages of decision trees

Decision tree learning algorithms

How a decision tree algorithm works

Understanding and measuring purity of node

An example

Supervised learning using Naïve Bayes

Advantages of Naïve Bayes

Disadvantages of Naïve Bayes

Uses of Naïve Bayes classification

How Bayesian methods work

Posterior probabilities

Class-conditional probabilities

Prior probabilities

Evidence

The bag of words

Advantages of using Naïve Bayes as a spam filter

Disadvantages of Naïve Bayes filters

Examples of Naïve Bayes

Summary

References

7. Unsupervised Machine Learning

Understanding clustering

How are clusters formed?

Types of clustering

Hierarchical clustering

Overlapping, exclusive, and fuzzy clustering

Differences between partial versus complete clustering

K-means clustering

K-means algorithm

Algorithm of K-means

Associating the data points with the closest centroid

How to choose the initial centroids?

Time-space complexity of K-means algorithms

Issues with K-means

Empty clusters in K-means

Outliers in the dataset

Different types of cluster

K-means – strengths and weaknesses

Bisecting K-means algorithm

Getting deep into hierarchical clustering

Agglomerative hierarchical clustering

How proximity is computed

Strengths and weaknesses of hierarchical clustering

Understanding the DBSCAN technique

So, what is density?

How are points classified using center-based density

DBSCAN algorithm

Strengths and weaknesses of the DBSCAN algorithm

Cluster validation

Example

Summary

References

8. Creating Ensemble Models

What is ensemble learning?

Understanding ensemble learning

How to construct an ensemble

Combination strategies

Subsampling training dataset

Bagging

When does bagging work?

Boosting

Boosting approach

Boosting algorithm

AdaBoost – boosting by sampling

What is boosting doing?

The bias and variance decomposition

Manipulating the input features

Injecting randomness

Random forests

Features of random forests

How do random forests work?

The out-of-bag (oob) error estimate

Gini importance

Proximities

Implementation in Julia

Learning and prediction

Why is ensemble learning superior?

Applications of ensemble learning

Summary

References

9. Time Series

What is forecasting?

Decision-making process

The dynamics of a system

What is TimeSeries?

Trends, seasonality, cycles, and residuals

Difference from standard linear regression

Basic objectives of the analysis

Types of models

Important characteristics to consider first

Systematic pattern and random noise

Two general aspects of time series patterns

Trend analysis

Smoothing

Fitting a function

Analysis of seasonality

Autocorrelation correlogram

Examining correlograms

Partial autocorrelations

Removing serial dependency

ARIMA

Common processes

ARIMA methodology

Identification

Estimation and forecasting

The constant in ARIMA models

Identification phase

Seasonal models

Parameter estimation

Evaluation of the model

Interrupted time series ARIMA

Exponential smoothing

Simple exponential smoothing

Indices of lack of fit (error)

Implementation in Julia

The TimeArray time series type

Using time constraints

when

from

findwhen

find

Mathematical, comparison, and logical operators

Applying methods to TimeSeries

Lag

Lead

Percentage

Combining methods in TimeSeries

Merge

Collapse

Map

Summary

References

10. Collaborative Filtering and Recommendation System

What is a recommendation system?

The utility matrix

Association rule mining

Measures of association rules

How to generate the item sets

How to generate the rules

Content-based filtering

Steps involved in content-based filtering

Advantages of content-based filtering

Limitations of content-based filtering

Collaborative filtering

Baseline prediction methods

User-based collaborative filtering

Item-item collaborative filtering

Algorithm of item-based collaborative filtering

Building a movie recommender system

Summary

11. Introduction to Deep Learning

Revisiting linear algebra

A gist of scalars

A brief outline of vectors

The importance of matrices

What are tensors?

Probability and information theory

Why probability?

Differences between machine learning and deep learning

What is deep learning?

Deep feedforward networks

Understanding the hidden layers in a neural network

The motivation of neural networks

Understanding regularization

Optimizing deep learning models

The case of optimization

Implementation in Julia

Network architecture

Types of layers

Neurons (activation functions)

Understanding regularizers for ANN

Norm constraints

Using solvers in deep neural networks

Coffee breaks

Image classification with pre-trained Imagenet CNN

Summary

References

Julia for Data Science

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: September 2016

Production reference: 1260916

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham

B3 2PB, UK.

ISBN 978-1-78528-969-9

www.packtpub.com

Credits

About the Author

Anshul Joshi is a data science professional with more than 2 years of experience primarily in data munging, recommendation systems, predictive modeling, and distributed computing. He is a deep learning and AI enthusiast. Most of the time, he can be caught exploring GitHub or trying anything new on which he can get his hands on. He blogs on anshuljoshi.xyz.

I'd like to thank my parents, who have been really supportive throughout, my professors, who helped me during my days at university and got me where I am, and my friends, who were very understanding. A big thanks to the Julia community. These people are amazing and are the rockstars of our generation.

I would also like to thank Packt Publishing and the editors for helping me throughout. A special thanks to Sébastien Celles; his expertise and reviews really helped me improve the book.

About the Reviewer

Sébastien Celles is a professor of applied physics at Poitiers Institute of Technology (Université de Poitiers—IUT de Poitiers—thermal science department). He teaches physics and computer sciences (data processing).

He has used Python for numerical simulations, data plotting, data predicting, and various other tasks since the early 2000s. He is a member of PyData and was granted commit rights to the pandas DataReader project. He is also involved in several open source projects about the scientific Python ecosystem.

He is also author of some Python packages available on PyPi:

openweathermap_requests: A package to fetch data from http://openweathermap.org/ using requests and requests-cache and get pandas DataFrames with weather history

pandas_degreedays: A package to calculate degree days (a measure of heating or cooling) from a pandas time series of temperature

pandas_confusion: A package to manage confusion matrices, plot them, binarize them, calculate overall statistics, and class statistics

He made some contributions (unit testing, continuous integration, Python 3 port…) too:

python-constraint: A Constraint Solving Problem (CSP) resolver for Python

He was a technical reviewer of Mastering Python for Data Science explores the world of data science through Python and learn how to make sense of data. Samir Madhavan. Birmingham, UK, Packt Publishing, August 2015.

Two years ago, he started to learn Julia, with which he has performed various tasks about data mining, machine learning, forecasting, and so he's a user of (and sometimes a contributor too) some Julia packages (DataStructures.jl, CSV.jl, DataFrames.jl, TimeSeries.jl, NDSparseData.jl, JuliaTS.jl, MLBase.jl, Mocha.jl, and so on)

He is also author of some Julia packages:

Pushover.jl: A package to send notifications using the Pushover Notification Service

BulkSMS.jl: A Julia package to send SMS (Short Message Service) using BulkSMS API

DataReaders.jl: A package to get remote data via Requests.jl and get DataFrames thanks to DataFrames.jl

RequestsCache.jl: A transparent persistent cache using the Requests.jl library to perform requests and using JLD.jl library as a storage backend

PubSub.jl: A very basic implementation of the publish-subscribe pattern

SignalSlot.jl: A very basic implementation of the signal-slot pattern

TALib.jl: A Julia wrapper for TA-Lib (Technical Analysis Library)

He has a keen interest in open data and he is a contributor of some projects of the Open Knowledge Foundation (especially around the DataPackage format).

You can find more information about him at http://www.celles.net/wiki/Contact.

www.PacktPub.com

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www.packtpub.com/mapt

Get the most in-demand software skills with Mapt. Mapt gives you full access to all Packt books and video courses, as well as industry-leading tools to help you plan your personal development and advance your career.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print, and bookmark content

On demand and accessible via a web browser

To my sister—I hope I made you a little proud.

Preface

Data Scientist: The Sexiest Job of the 21st Century, Harvard Business Review. And why Julia? A high level language with large scientific community and performance comparable to C, it is touted as next best language for data science. Using Julia, we can create statistical models, highly performant machine learning systems, and beautiful and attractive visualizations.

What this book covers

Chapter 1, The Groundwork – Julia’s Environment, explains how to set up the Julia’s environment (Command Line(REPL) and Jupyter Notebook) and explains Julia’s ecosystem, why Julia is special, and package management. It also gives an introduction to parallel processing and multiple dispatch and explains how Julia is suited for data science.

Chapter 2, Data Munging, explains the need for and process of data preparation, also called data munging. Data munging refers to changing data from one state to other, in well-defined reversible steps. It is preparing data to be used for analytics and visualizations.

Chapter 3, Data Exploration, explains that statistics is the core of data science, shows that Julia provides various statistical functions. This chapter will give a high-level overview of statistics and will explain the techniques required to apply those statistical concepts to general problems using Julia’s statistical packages, such as Stats.jl and Distributions.jl.

Chapter 4, Deep Dive into Inferential Statistics, continues statistics is the core of the data science and is Julia provides various statistical functions. This chapter will give high level overview of advance statistics and then will explain the techniques to apply those statistical concepts on general problems using Julia’s statistical packages such as Stats.jl and Distributions.jl.

Chapter 5, Making Sense of Data Using Visualization, explains why data visualization is essential part of data science and how it makes communicating the results more effective and reaches out to larger audience. This chapter will go through the Vega, Asciiplot, and Gadfly packages of Julia, which are used for data visualization.

Chapter 6, Supervised Machine Learning, says A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E – Tom M. Mitchell. Machine learning is a field of study that gives computers the ability to learn and enhance without being explicitly programmed. This chapter will explain that Julia is a high-level language with a great performance, and is nicely suited for machine learning. This chapter will focus on supervised machine learning algorithms such as Naive Bayes, regression, and decision trees.

Chapter 7, Unsupervised Machine Learning, explains that unsupervised learning is a little bit different and harder than supervised learning. The aim is to get the system to learn something but we don’t know what it will learn. This chapter will focus on unsupervised learning algorithms such as clustering.

Chapter 8, Creating Ensemble Models, explains that a group of people has the ability to take better decisions than a single individual, especially when each group member comes in with their own biases. This is also true for machine learning. This chapter will focus on a machine learning technique called ensemble learning, an example being random forest.

Chapter 9, Time Series, shows the capacity to demonstrate and perform decision modeling, and explains that examination is a crucial component of some real-world applications running from emergency medical treatment in intensive care units to military command and control frameworks. This chapter focuses on time series data and forecasting using Julia.

Chapter 10, Collaborative Filtering and Recommendation System, explains that every day we are confronted with decisions and choices. These can range from our clothes to the movies we watch or what to eat when we order online. We take decisions in business too. For instance, which stock should we invest in? What if decision making could be automated, and suitable recommendations could be given to us. This chapter focuses on recommendation systems and techniques such as collaborative filtering and association rule mining.

Chapter 11, Introduction to Deep Learning, explains that deep learning refers to a class of machine learning techniques that do unsupervised or supervised feature extraction and pattern analysis or classification by exploiting multiple layers of non-linear information processing. This chapter will introduce us to deep learning in Julia. Deep learning is a new branch of machine learning with one goal – Artificial Intelligence. We will also learn about Julia's framework, Julia's, Mocha.jl, with which we can implement deep learning.

What you need for this book

The reader will requires a system (64-bit recommended) having a fairly recent operating system (Linux, Windows 7+, and Mac OS) with a working Internet connection and privileges to install Julia, Git and various packages used in the book.

Who this book is for

The standard demographic is data analysts and aspiring data scientists with little to no grounding in the fundamentals of the Julia language, who are looking to explore how to conduct data science with Julia's ecosystem of packages. On top of this are competent Python or R users looking to leverage Julia to enhance the efficiency of their ability to conduct data science. A good background in statistics and computational mathematics is expected.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: Julia also provides another function, summarystats().

A block of code is set as follows:

ci(x::HypothesisTests.FisherExactTest)

ci(x::HypothesisTests.FisherExactTest, alpha::Float64)

ci(x::HypothesisTests.TTest)

ci(x::HypothesisTests.TTest, alpha::Float64)

Any command-line input or output is written as follows:

julia> Pkg.update()

julia> Pkg.add(StatsBase)

New terms and important words are shown in bold.

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book-what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail feedback@packtpub.com, and mention the book's title in

Enjoying the preview?

Page 1 of 1

Julia for Data Science

About this ebook

Anshul Joshi

Related authors

Related to Julia for Data Science

Related ebooks

Data Modeling & Design For You

Related podcast episodes

Related articles

Related categories

Reviews for Julia for Data Science

What did you think?

Book preview

Julia for Data Science - Anshul Joshi

Table of Contents

Julia for Data Science

Julia for Data Science

About the Author

About the Reviewer

www.PacktPub.com

Why subscribe?

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Note

Tip

Reader feedback