Apache Spark

Zaharia M., Chambers B. Spark: The Definitive Guide: Big data processing made simple

pdf

Section: Distributed Computing and Systems → Apache Spark

O’Reilly Media, 2018. — 608 p. — ISBN: 978-1491912218. Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of this open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique...

No. 1
8.60 MB
added 24.02.2018 21:35
description edited 25.02.2018 02:35

Karau H., Konwinski A., Wendell P., Zaharia M. Learning Spark

pdf

O’Reilly Media, 2015. — 274 p. — e-ISBN: 978-1-4493-5904-1, ISBN10: 1-4493-5904-3. Data in all domains is getting bigger. How can you work with it efficiently? This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java,...

No. 2
7.82 MB
added 03.04.2015 14:04
description edited 16.06.2017 19:29

Jurney Russell. Agile Data Science 2.0: Building Full-Stack Data Analytics Applications with Spark

pdf

O’Reilly Media, 2017. — 352 p. — ISBN: 978-1491960110. Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they’re to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to...

No. 3
11.53 MB
added 13.06.2017 22:34
description edited 16.06.2017 19:29

Karau H., Warren R. High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark

pdf

O’Reilly Media, 2017. — 358 p. — ISBN: 978-1491943205. True PDF. Apache Spark is amazing when everything clicks. But if you haven’t seen the performance improvements you expected, or still don’t feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries...

No. 4
7.00 MB
added 31.08.2017 10:33
description edited 29.10.2020 03:27

Damji Jules, Wenig Brooke, Das Tathagata, Lee Denny. Learning Spark: Lightning-Fast Data Analytics

pdf

2nd Edition. — O’Reilly Media, 2020. — 398 p. Data is getting bigger, arriving faster, and coming in varied formats—and it all needs to be processed at scale for analytics or machine learning. How can you process such varied data workloads efficiently? Enter Apache Spark. Updated to emphasize new features in Spark 2.x, this second edition shows data engineers and scientists...

No. 5
14.72 MB
added 01.07.2020 17:17
description edited 02.07.2020 01:13

Estrada R., Ruiz I. Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka

pdf

Apress, 2016. — 296 p. — ISBN: 9781484221747. This book is about how to integrate full-stack open source big data architecture and how to choose the correct technology—Scala/Spark, Mesos, Akka, Cassandra, and Kafka—in every layer. Big data architecture is becoming a requirement for many different enterprises. So far, however, the focus has largely been on collecting,...

No. 6
4.60 MB
added 25.10.2016 20:41
description edited 16.06.2017 19:29

Ryza S., Laserson U., Owen S., Wills J. Advanced Analytics with Spark: Patterns for Learning from Data at Scale

pdf

2nd ed. — O’Reilly Media, 2017. — 283 p. — ASIN B072KFWZ8S. In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. You’ll start with an...

No. 7
3.81 MB
added 16.06.2017 01:47
description edited 06.08.2022 21:26

Pentreath Nick. Machine Learning with Spark

pdf

Packt Publishing, 2015. — 338 p. — e-ISBN: 978-1-78328-852-6, ISBN10: 1-78328-852-3. Apache Spark is a framework for distributed computing that is designed from the ground up to be optimized for low-latency tasks and in-memory data storage. It is one of the few frameworks for parallel computing that combines speed, scalability, in-memory processing, and fault tolerance with ease...

No. 8
4.74 MB
added 08.04.2015 11:08
description edited 16.06.2017 19:29

Perrin Jean-Georges. Spark in Action (Final)

pdf

2nd Edition. — Manning Publications, 2020. — 577 p. — ISBN: 978-1617295522. Spark in Action, Second Edition is an entirely new book that teaches you everything you need to create end-to-end analytics pipelines in Spark. Rewritten from the ground up with lots of helpful graphics, you’ll learn the roles of DAGs and dataframes, the advantages of “lazy evaluation”, and ingestion...

No. 9
19.56 MB
added 19.05.2020 14:36
description edited 19.05.2020 16:31

Venkat Ankam. Big Data Analytics: A handy reference guide for data analysts and data scientists to help to obtain value from big data analytics using Spark on Hadoop clusters

pdf

Packt Publishing, 2016. — 326 p. — ISBN: 9781785884696. The book Big Data Analytics aims to provide the fundamentals of Apache Spark and Hadoop. All Spark components (Spark Core, Spark SQL, DataFrames, Datasets, Conventional Streaming, Structured Streaming, MLlib, GraphX) and Hadoop core components (HDFS, MapReduce and YARN) are explored in greater depth with implementation...

No. 10
6.53 MB
added 25.10.2016 12:58
description edited 16.06.2017 19:29

Ryza S. et al. Advanced Analytics with Spark: Patterns for Learning from Data at Scale

pdf

O’Reilly Media, 2015. — 275 p. — ISBN: 1491912766, 9781491912768. Authors: Ryza S., Laserson U., Owen S., Wills J. In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems...

No. 11
4.03 MB
added 28.05.2015 17:17
description edited 06.08.2022 21:26

Butakov N.A., Petrov M.V., Nasonov D. Обработка больших данных с Apache Spark [Big Data Processing with Apache Spark]

pdf

Study guide. — St. Petersburg: ITMO University, 2019. — 50 p. This study guide contains theoretical material and worked examples of tasks for the course "Introduction to Big Data Processing Technologies". It is designed for laboratory work carried out with the Apache Spark framework. The course covers a range of topics related to the organization...

No. 12
2.81 MB
added 26.04.2019 16:09
description edited 26.04.2019 17:19

Abbasi M.A. Learning Apache Spark 2

pdf

Packt Publishing, 2017. — 356 p. — ISBN: 978-1785885136. True PDF. Key Features: Exclusive guide that covers how to get up and running with fast data processing using Apache Spark. Explore and exploit various possibilities with Apache Spark using real-world use cases in this book. Want to perform efficient data processing in real time? This book will be your one-stop solution. Book...

No. 13
10.72 MB
added 16.06.2017 19:23
description edited 17.06.2017 03:25

Haines Scott. Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications

pdf

Apress Media LLC, 2022. — 595 p. — ISBN-13: 978-1-4842-7451-4. Leverage Apache Spark within a modern data engineering ecosystem. This hands-on guide will teach you how to write fully functional applications, follow industry best practices, and learn the rationale behind these decisions. With Apache Spark as the foundation, you will follow a step-by-step journey beginning with...

No. 14
7.80 MB
added 24.03.2022 13:49
description edited 24.03.2022 15:00

Aven Jeffrey. Sams Teach Yourself Apache Spark in 24 Hours

pdf

Sams Publishing, 2017. — 592 p. — ISBN13: 978-0-672-33851-9. Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems and is one of the most active open source big data projects to date. In just 24 lessons of one hour or less, Sams Teach Yourself Apache Spark in 24 Hours helps you build practical Big Data solutions that...

No. 15
36.56 MB
added 06.05.2017 16:20
description edited 16.06.2017 19:29

Maas Gerard, Garillot François. Stream Processing with Apache Spark: Mastering Structured Streaming and Spark Streaming

pdf

O’Reilly Media, 2019. — 452 p. — ISBN13: 978-1-491-94424-0. Before you can build analytics tools to gain quick insights, you first need to know how to process data in real time. With this practical guide, developers familiar with Apache Spark will learn how to put this in-memory framework to use for streaming data. You’ll discover how Spark enables you to write streaming jobs...

No. 16
8.26 MB
added 10.05.2020 04:49
description edited 31.10.2020 23:00

Amirghodsi S., Rajendran M., Hall B., Mei S. Apache Spark 2.x Machine Learning Cookbook: Over 100 recipes to simplify machine learning model implementations with Spark

pdf

Packt Publishing, 2017. — 666 p. — ASIN B01BKL1PD8. Simplify machine learning model implementations with Spark. About This Book: Solve the day-to-day problems of data science with Spark. This unique cookbook consists of exciting and intuitive numerical recipes. Optimize your work by acquiring, cleaning, analyzing, predicting, and visualizing your data. Who This Book Is For: This book...

No. 17
22.66 MB
added 08.10.2017 20:28
description edited 09.10.2017 01:23

Thomas A. Natural Language Processing with Spark NLP: Learning to Understand Text at Scale

pdf

O’Reilly, 2020. — 367 p. — ISBN: 1-492-04776-6. If you want to build an enterprise-quality application that uses natural language text but aren’t sure where to begin or what tools to use, this practical guide will help get you started. Alex Thomas, principal data scientist at Wisecube, shows software engineers and data scientists how to build scalable natural language...

No. 18
8.87 MB
added 06.09.2020 10:47
description edited 10.09.2020 07:24

Nandi Amit. Spark for Python Developers: A concise guide to implementing Spark Big Data analytics for Python developers, and building a real-time and insightful trend tracker data intensive app

pdf

Packt Publishing, 2015. - 206p. Looking for a cluster computing system that provides high-level APIs? Apache Spark is your answer—an open source, fast, and general purpose cluster computing system. Spark's multi-stage memory primitives provide performance up to 100 times faster than Hadoop, and it is also well-suited for machine learning algorithms. Are you a Python developer...

No. 19
9.43 MB
added 30.01.2016 20:10
description edited 16.06.2017 19:29

Parsian M. Data Algorithms with Spark: Recipes and Design Patterns for Scaling Up using PySpark

pdf

O’Reilly Media, 2022. — 435 p. — ISBN 1492082384. Apache Spark's speed, ease of use, sophisticated analytics, and multilanguage support make practical knowledge of this cluster-computing framework a required skill for data engineers and data scientists. With this hands-on guide, anyone looking for an introduction to Spark will learn practical algorithms and examples using...

No. 20
12.58 MB
added 05.06.2022 12:31
description edited 06.06.2022 04:22

Zubair Nabi. Pro Spark Streaming: The Zen of Real-Time Analytics Using Apache Spark

pdf

Apress, 2016. — 231 p. — ISBN: 978-1-4842-4800. Learn the right cutting-edge skills and knowledge to leverage Spark Streaming to implement a wide array of real-time, streaming applications. Pro Spark Streaming walks you through end-to-end real-time application development using real-world applications, data, and code. Taking an application-first approach, each chapter...

No. 21
13.41 MB
added 16.06.2016 23:23
description edited 16.06.2017 19:29

Bikramaditya Singhal, Srinivas Duvvuri. Spark for Data Science

pdf

Packt Publishing, 2016. — 339 p. — ISBN: 1785885650. — ASIN: B01CGKAILW. Key Features: Perform data analysis and build predictive models on huge datasets that leverage Apache Spark. Learn to integrate data science algorithms and techniques with the fast and scalable computing features of Spark to address big data challenges. Work through practical examples on real-world problems...

No. 22
13.00 MB
added 08.10.2016 22:11
description edited 16.06.2017 19:29

Iozzia G. Hands-On Deep Learning with Apache Spark: Build and deploy distributed deep learning applications on Apache Spark

pdf

Packt, 2019. — 322 p. — ISBN: 1788994613. Speed up the design and implementation of deep learning solutions using Apache Spark. Deep learning is a subset of machine learning where datasets with several layers of complexity can be processed. Hands-On Deep Learning with Apache Spark addresses the sheer complexity of technical and analytical parts and the speed at which deep...

No. 23
12.94 MB
added 10.03.2019 14:07
description edited 11.03.2019 09:32

Malak M., East R. Spark GraphX in Action

pdf

Manning Publications, 2016. — 282 p. in color. — ISBN: 1617292524, 9781617292521. Spark GraphX in Action starts out with an overview of Apache Spark and the GraphX graph processing API. This example-based tutorial then teaches you how to configure GraphX and how to use it interactively. Along the way, you'll collect practical techniques for enhancing applications and applying...

No. 24
17.16 MB
added 01.07.2016 00:56
description edited 16.06.2017 19:29

Galeano M.I.F. Big Data Processing with Apache Spark: Efficiently tackle large datasets and big data analysis with Spark and Python

pdf

Packt Publishing, 2018. — 142 p. — ASIN B07HRTNFZ9. No need to spend hours ploughing through endless data – let Spark, one of the fastest big data processing engines available, do the hard work for you. Key Features: Get up and running with Apache Spark and Python. Integrate Spark with AWS for real-time analytics. Apply processed data streams to machine learning APIs of Apache...

No. 25
2.38 MB
added 25.12.2018 13:43
description edited 26.12.2018 00:10

Zaharia M., Wendell P., Konwinski A., Karau H. Изучаем Spark. Молниеносный анализ данных [Learning Spark: Lightning-Fast Data Analysis]

pdf

DMK Press, 2015. — 303 p. — ISBN: 5970603236, 9785970603239. The volume of data processed in every field of human activity keeps growing rapidly. Are there efficient ways to work with it? This book describes Apache Spark, an open source cluster computing system that makes it possible to quickly build high-performance programs for analyzing...

No. 26
15.69 MB
added 11.02.2016 19:21
description edited 16.06.2017 19:29

Kumar S., Gulati S. Apache Spark 2.x for Java Developers

pdf

Packt Publishing, 2017. — 350 p. — ASIN B01LY3N7ZO. Key Features: Perform big data processing with Spark—without having to learn Scala! Use the Spark Java API to implement efficient enterprise-grade applications for data processing and analytics. Go beyond mainstream data processing by adding querying capability, Machine Learning, and graph processing using Spark. Book Description...

No. 27
7.95 MB
added 15.08.2017 19:39
description edited 26.08.2017 21:30

Yadav Rishi. Apache Spark 2.x Cookbook

pdf

Packt Publishing, 2017. — 294 p. — ISBN13: 9781787127265. True PDF. Over 70 recipes to help you use Apache Spark as your single big data computing platform and master its libraries. While Apache Spark 1.x gained a lot of traction and adoption in the early years, Spark 2.x delivers notable improvements in the areas of API, schema awareness, performance, Structured Streaming, and...

No. 28
14.55 MB
added 03.08.2017 18:39
description edited 03.08.2017 18:52

Bonaci Marko, Zecevic Petar. Spark in Action

pdf

Manning Publications, 2016. — 472 p. Big data systems distribute datasets across clusters of machines, making it a challenge to efficiently query, stream, and interpret them. Spark can help. It is a processing system designed specifically for distributed data. It provides easy-to-use interfaces, along with the performance you need for production-quality analytics and machine...

No. 29
10.96 MB
added 19.04.2018 16:05
description edited 20.04.2018 04:17

Mittal M. et al. Big Data Processing Using Spark in Cloud

pdf

Springer, 2018. — 274 p. — ISBN: 9811305498. The book describes the emergence of big data technologies and the role of Spark in the entire big data stack. It compares Spark and Hadoop and identifies the shortcomings of Hadoop that have been overcome by Spark. The book mainly focuses on the in-depth architecture of Spark and our understanding of Spark RDDs and how RDD...

No. 30
8.49 MB
added 16.06.2018 13:53
description edited 17.06.2018 00:39

Frampton Mike. Mastering Apache Spark: Gain expertise in processing and storing data by using advanced techniques with Apache Spark

pdf

Packt Publishing, 2015. - 318p. Apache Spark is an in-memory cluster based parallel processing system that provides a wide range of functionality like graph processing, machine learning, stream processing and SQL. It operates at unprecedented speeds, is easy to use and offers a rich set of data transformations. This book aims to take your limited knowledge of Spark to the...

No. 31
17.76 MB
added 30.10.2015 18:05
description edited 16.06.2017 19:29

Sarkar Aurobindo. Learning Spark SQL

pdf

Packt Publishing, 2017. — 452 p. — ISBN: 978-1-78588-835-9. Design, implement, and deliver successful streaming applications, machine learning pipelines and graph applications using the Spark SQL API. In the past year, Apache Spark has been increasingly adopted for the development of distributed applications. Spark SQL APIs provide an optimized interface that helps developers build...

No. 32
40.58 MB
added 16.09.2017 19:46
description edited 16.09.2017 23:02

Luu Hien. Beginning Apache Spark 3: With DataFrame, Spark SQL, Structured Streaming, and Spark Machine Learning Library

pdf

2nd Edition. — Apress Media, LLC, 2021. — 445 p. — ISBN-13: 978-1-4842-7382-1. Take a journey toward discovering, learning, and using Apache Spark 3.0. In this book, you will gain expertise on the powerful and efficient distributed data processing engine inside of Apache Spark; its user-friendly, comprehensive, and flexible programming model for processing data in batch and...

No. 33
8.39 MB
added 22.10.2021 19:54
description edited 23.10.2021 02:11

Kaysar M., Karim R. Large Scale Machine Learning with Spark

pdf

Packt, 2016. — 501 p. — ISBN: 978-1-78588-874-8. True PDF. Get the most up-to-date book on the market that focuses on design, engineering, and scalable solutions in machine learning with Spark 2.0.0. Use Spark's machine learning library in a big data environment. You will learn how to develop high-value applications at scale with ease and develop a personalized design. Who This...

No. 34
11.47 MB
added 19.06.2017 03:05
description edited 22.05.2020 00:15

Databricks. Using Apache Spark

pdf

A manual from Databricks on using Apache Spark. Contents: Log Analysis with Spark; Introduction to Apache Spark; Importing Data; Exporting Data; Log Analyzer Application; Twitter Streaming Language Classifier; Collect a Dataset of Tweets; Examine the Tweets and Train a Model; Apply the Model in Real-time.

No. 35
556.28 KB
added 26.01.2016 19:41
description edited 16.06.2017 19:29

Frampton M. Complete Guide to Open Source Big Data Stack

pdf

Apress, 2018. — 375 p. — ISBN: 978-1-4842-2148-8. See a Mesos-based big data stack created and the components used. You will use currently available Apache full and incubating systems. The components are introduced by example and you learn how they work together. The author begins by creating a private cloud and then installs and examines Apache Brooklyn. After that,...

No. 36
9.49 MB
added 18.01.2018 21:39
description edited 18.01.2018 22:14

Luu H. Beginning Apache Spark 2

pdf

Apress, 2018. — 393 p. — ISBN: 978-1484235782. Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. Beginning Apache Spark 2 gives you an introduction to Apache Spark and shows you how to work with it. Along the way, you’ll...

No. 37
5.55 MB
added 17.08.2018 12:27
description edited 18.08.2018 03:13

Chitturi Padma Priya. Apache Spark for Data Science Cookbook

pdf

Packt Publishing, 2016. — 392 p. — ISBN: 1785880101. True PDF. Spark has emerged as the most promising big data analytics engine for data science professionals. The true power and value of Apache Spark lies in its ability to execute data science tasks with speed and accuracy. Spark’s selling point is that it combines ETL, batch analytics, real-time stream analysis, machine...

No. 38
4.51 MB
added 10.10.2017 17:01
description edited 10.10.2017 19:00

Ankam V. Big Data Analytics (+code)

archive
pdf

Packt Publishing Ltd., Birmingham, UK, 2016. — 325 p. — ISBN: 9781785884696. A handy reference guide for data analysts and data scientists to help to obtain value from big data analytics using Spark on Hadoop clusters. The book Big Data Analytics aims to provide the fundamentals of Apache Spark and Hadoop. All Spark components – Spark Core, Spark SQL, DataFrames, Datasets,...

No. 39
10.21 MB
added 02.05.2017 15:00
description edited 16.06.2017 19:29

García Alfonso Antolínez. Hands-on Guide to Apache Spark 3: Build Scalable Computing Engines for Batch and Stream Data Processing

pdf

Apress Media, LLC, 2023. — 416 p. — ISBN-13: 978-1-4842-9379-9. This book explains how to scale Apache Spark 3 to handle massive amounts of data, either via batch or streaming processing. It covers how to use Spark’s structured APIs to perform complex data transformations and analyses you can use to implement end-to-end analytics workflows. This book covers Spark 3’s new...

No. 40
8.09 MB
added 08.06.2023 14:40
description edited 09.06.2023 00:45

Morgan Andrew et al. Mastering Spark for Data Science

pdf

Packt Publishing, 2017. — 550 p. Master the techniques and sophisticated analytics used to construct Spark-based solutions that scale to deliver production-grade data science products. Data science seeks to transform the world using data, and this is typically achieved through disrupting and changing real processes in real industries. In order to operate at this level you need...

No. 41
34.99 MB
added 15.01.2018 01:04
description edited 15.01.2018 05:28

Kienzler Romeo. Mastering Apache Spark 2.x

pdf

2nd Edition. — Packt Publishing, 2017. — 345 p. Apache Spark is an in-memory, cluster-based, parallel processing system that provides a wide range of functionality such as graph processing, machine learning, stream processing, and SQL. This book aims to take your limited knowledge of Spark to the next level by teaching you how to expand your Spark functionality. The book opens...

No. 42
29.65 MB
added 14.01.2018 23:36
description edited 15.01.2018 03:31

Liu A. Apache Spark Machine Learning Blueprints

pdf

Packt Publishing, 2016. - 252 p. - ASIN: B01GEUF1H6. True PDF. Key Features: Customize Apache Spark and R to fit your analytical needs in customer research, fraud detection, risk analytics, and recommendation engine development. Develop a set of practical Machine Learning applications that can be implemented in real-life projects. A comprehensive, project-based guide to improve and...

No. 43
4.22 MB
added 05.10.2017 13:49
description edited 05.10.2017 16:31

Perez Marco. Apache Spark

pdf

Independently published, 2021. — 301 p. — ASIN B0959QYBSW. Distributed Processing for Massive Datasets. Contents: About the Author; About the Technical Reviewer; Part I: Getting Started; Understanding Apache Spark; An Example; The Core Use Cases; Transform Your Data; Analyze Your Data; Machine Learning; NET for Apache Spark; Feature Parity; Setting Up Spark; Choosing Your Software Versions; Choosing a...

No. 44
2.76 MB
added 19.05.2021 11:34
description edited 19.05.2021 22:51

Ilijason Robert. Beginning Apache Spark Using Azure Databricks: Unleashing Large Cluster Analytics in the Cloud

pdf

Apress, 2020. — 281 p. — ISBN13: 978-1-4842-5780-7. Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical...

No. 45
2.80 MB
added 11.06.2020 16:51
description edited 11.06.2020 17:07

Kane F. Frank Kane's Taming Big Data with Apache Spark and Python

archive
pdf

Packt Publishing, 2017. — 296 p. — ASIN B071VVFDMP. True PDF + sample files. Key Features: Understand how Spark can be distributed across computing clusters. Develop and run Spark jobs efficiently using Python. A hands-on tutorial by Frank Kane with over 15 real-world examples teaching you Big Data processing with Spark. Book Description: Frank Kane's Taming Big Data with Apache Spark...

No. 46
11.83 MB
added 05.10.2017 13:29
description edited 31.01.2024 13:54

Penchikala Srini. Big Data Processing with Apache Spark

pdf

NY: InfoQ, 2018. — 104 p. Apache Spark is an open-source big-data processing framework built around speed, ease of use, and sophisticated analytics. Spark has several advantages compared to other big-data and MapReduce technologies like Hadoop and Storm. It provides a comprehensive, unified framework with which to manage big-data processing requirements for datasets that are...

No. 47
2.45 MB
added 31.05.2018 16:42
description edited 28.01.2019 01:49

Subhashini Chellappan, Dharanitharan Ganesan. Practical Apache Spark: Using the Scala API

pdf

Apress, 2019. — 288 p. — ISBN: 1484236513, 9781484236512. Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark with the help of practical code snippets for each topic. Practical...

No. 48
23.31 MB
added 11.02.2019 16:01
description edited 11.02.2019 16:28

Dua Rajdeep. Machine Learning with Spark

pdf

2nd ed. — Packt Publishing, 2017. — 532 p. — ISBN: 978-1-78588-993-6. True PDF. Create scalable machine learning applications to power a modern data-driven business using Spark 2.x. This book will teach you about popular machine learning algorithms and their implementation. You will learn how various machine learning concepts are implemented in the context of Spark ML. You will...

No. 49
20.03 MB
added 03.08.2017 18:36
description edited 22.05.2020 00:15

Maas Gerard, Garillot François. Stream Processing with Apache Spark

pdf

O’Reilly Media, 2019. — 156 p. — ISBN: 1491944242. To build analytics tools that provide faster insights, knowing how to process data in real time is a must, and moving from batch processing to stream processing is absolutely required. Fortunately, the Spark in-memory framework/platform for processing data has added an extension devoted to fault-tolerant stream processing:...

No. 50
3.89 MB
added 11.04.2019 20:47
description edited 08.11.2020 05:52

Perrin Jean-Georges. Spark in Action (MEAP)

pdf

2nd Edition. — Manning Publications, 2020. — 629 p. — ISBN: 978-1617295522. Spark in Action, Second Edition is an entirely new book that teaches you everything you need to create end-to-end analytics pipelines in Spark. Rewritten from the ground up with lots of helpful graphics, you’ll learn the roles of DAGs and dataframes, the advantages of “lazy evaluation”, and ingestion...

No. 51
21.79 MB
added 28.02.2020 22:24
description edited 28.02.2020 23:04

Karau H., Bida A., Polak A., Warren R. High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark

pdf

2nd Edition (Second Early Release) — O’Reilly Media, 2024. — 350 p. — ISBN: 9780137957002. Apache Spark is amazing when everything clicks. But if you haven't seen the performance improvements you expected or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau, Rachel Warren, and Anya Bida walk you through the...

No. 52
1.04 MB
Added 28.05.2024 17:42
Description edited 28.05.2024 17:43

Sourav Gulati, Sumit Kumar. Apache Spark 2.x for Java Developers

pdf

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2017. — 350 p. — ISBN: 978-1-78712-649-7. Unleash the data processing and analytics capability of Apache Spark with the language of choice: Java. Apache Spark is the buzzword in the big data industry right now, especially with the increasing need for real-time streaming and data processing. While Spark is built on Scala, the Spark Java API exposes all the Spark...

No. 53
8.02 MB
Added 16.08.2017 02:19
Description edited 05.05.2018 05:09

Dev Athul. Spark with Python

pdf

Section: Distributed Computing and Systems → Apache Spark

Independently published, 2020. — 154 p. Nowadays the internet is an integral part of our lives. From the moment we wake, we are immersed in it, creating a Facebook post or watching a YouTube video, and in the process we create data. Think of the entire human population participating in this process of creating data every day, every...

No. 54
6.02 MB
Added 10.06.2020 16:19
Description edited 10.06.2020 18:02

Thottuvaikkatumana Rajanarayanan. Apache Spark 2 for Beginners

pdf

Section: Distributed Computing and Systems → Apache Spark

True PDF. Packt Publishing, 2016. — 322 p. — ISBN: 978-1-78588-500-6. Spark is one of the most widely used large-scale data processing engines and runs extremely fast. It is a framework with tools that are equally useful for application developers and data scientists. This book starts with the fundamentals of Spark 2 and covers the core data processing framework and...

No. 55
21.81 MB
Added 28.10.2017 00:01
Description edited 28.10.2017 07:33

Ganelin Ilya, Orhian Ema, Sasaki Kai, York Brennon. Spark: big data cluster computing in production

pdf

Section: Distributed Computing and Systems → Apache Spark

Indianapolis, IN : Wiley, 2016. — 205 p. — ISBN: 978-1-119-25404-1. Spark: Big Data Cluster Computing in Production goes beyond general Spark overviews to provide targeted guidance toward using lightning-fast big-data clustering in production. Written by an expert team well-known in the big data community, this book walks you through the challenges in moving from...

No. 56
5.91 MB
Added 21.03.2019 22:55
Description edited 28.07.2019 02:35

Karau Holden, Warren Rachel. High Performance Spark

pdf

Section: Distributed Computing and Systems → Apache Spark

O’Reilly Media, 2017. — 358 p. — ISBN: 978-1-491-94320-5. Apache Spark is amazing when everything clicks. But if you haven’t seen the performance improvements you expected, or still don’t feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run...

No. 57
5.68 MB
Added 10.06.2017 20:27
Description edited 16.06.2017 19:29

Tellez Alex, Pumperla Max, Malohlava Michal. Mastering Machine Learning with Spark 2.x

pdf

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2017. — 323 p. — ISBN: 978-1-78528-345-1. Unlock the complexities of machine learning algorithms in Spark to generate useful data insights through this data analysis tutorial. The purpose of machine learning is to build systems that learn from data. Being able to understand trends and patterns in complex data is critical to success; it is one of the key...

No. 58
12.78 MB
Added 12.03.2018 01:52
Description edited 03.04.2018 01:59

Kukreja Manoj. Data Engineering with Apache Spark, Delta Lake, and Lakehouse

pdf

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2021. — 480 p. — ISBN: 1801077746, 9781801077743. Understand the complexities of modern-day data engineering platforms and explore strategies to deal with them with the help of use case scenarios led by an industry expert in big data. Key Features: Become well-versed with the core concepts of Apache Spark and Delta Lake for building data platforms. Learn how to...

No. 59
16.48 MB
Added 14.11.2021 22:40
Description edited 10.02.2022 02:45

Thottuvaikkatumana Rajanarayanan. Spark 2.0 for Beginners

pdf

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2016. — 322 p. — ISBN: 1785885006. Spark is one of the most widely used large-scale data processing engines and runs extremely fast. It is a framework with tools that are equally useful for application developers and data scientists. SparkR or “R on Spark” in the Spark framework opened the door of Spark data processing capability to the R...

No. 60
23.57 MB
Added 19.10.2016 11:19
Description edited 07.06.2020 05:15

Ed Elliott. Introducing .NET for Apache Spark: Distributed Processing for Massive Datasets

pdf

Section: Distributed Computing and Systems → Apache Spark

Sussex: Apress, 2021. — 269 p. — ISBN: 978-1-4842-6991-6. Get started using Apache Spark via C# or F# and the .NET for Apache Spark bindings. This book is an introduction to both Apache Spark and the .NET bindings. Readers new to Apache Spark will get up to speed quickly using Spark for data processing tasks performed against large and very large datasets. You will learn how to...

No. 61
4.49 MB
Added 14.04.2021 03:37
Description edited 14.04.2021 03:45

Karim Rezaul, Alla Sridhar. Scala and Spark for Big Data Analytics

pdf

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2017. — 874 p. — ISBN-10: 1785280848, ISBN-13: 978-1785280849. Scala has seen wide adoption over the past few years, especially in the field of data science and analytics. Spark, built on Scala, has gained a lot of recognition and is widely used in production. Thus, if you want to leverage the power of Scala and Spark to make sense of big data, this...

No. 62
86.48 MB
Added 16.10.2017 15:34
Description edited 23.08.2018 15:33

Karau Holden, Sankar Krishna. Fast Data Processing with Spark 2

pdf

Section: Distributed Computing and Systems → Apache Spark

3rd Ed. — Packt Publishing, 2016. — 269 p. — ISBN: 1785889273. When people want a way to process Big Data at speed, Spark is invariably the solution. With its ease of development (in comparison to the relative complexity of Hadoop), it’s unsurprising that it’s becoming popular with data analysts and engineers everywhere. Beginning with the fundamentals, we’ll show you how to...

No. 63
31.61 MB
Added 29.10.2016 13:32
Description edited 07.06.2020 04:38

Michael Armbrust. Spark SQL - Relational Data Processing in Spark

pdf

Section: Distributed Computing and Systems → Apache Spark

SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. Spark SQL is a new module in Apache Spark that integrates relational processing with Spark’s functional programming API. Built on our experience with Shark, Spark SQL lets Spark programmers leverage the benefits of relational processing (e.g., declarative queries and optimized storage),...

No. 64
536.69 KB
Added 25.09.2017 03:14
Description edited 25.09.2017 15:38

Palacio Alan. Distributed Data Systems with Azure Databricks: Create, deploy, and manage enterprise data pipelines

pdf

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2021. — 414 p. — ISBN: 978-1838647216. Quickly build and deploy massive data pipelines and improve productivity using Azure Databricks. Key Features: Get to grips with the distributed training and deployment of machine learning and deep learning models. Learn how ETLs are integrated with Azure Data Factory and Delta Lake. Explore deep learning and machine learning...

No. 65
17.84 MB
Added 26.05.2021 08:32
Description edited 22.06.2021 15:02

Ryza S., Laserson U., Owen S., Wills J. Spark для профессионалов: современные паттерны обработки больших данных [Spark for Professionals: Modern Patterns for Big Data Processing]

pdf

Section: Distributed Computing and Systems → Apache Spark

St. Petersburg: Piter, 2017. — 272 p. In this practical book, four Cloudera data analysis specialists describe self-contained patterns for performing large-scale data analysis with Spark. The authors take a comprehensive look at Spark, statistical methods, and data sets collected under real-world conditions, and use these examples to demonstrate solutions to common...

No. 66
5.62 MB
Added 13.12.2017 17:44
Description edited 11.01.2018 13:23

Svaljek M. Spark Succinctly

pdf

Section: Distributed Computing and Systems → Apache Spark

Syncfusion Inc., 2015. — 111 p. Mastering big data requires an aptitude at every step of information processing. Post-processing, one of the most important steps, is where you find Apache Spark frequently employed. Spark Succinctly, by Marko Švaljek, addresses Spark’s use in the ultimate step in handling big data. Topics included: - Introduction - Installing Spark - Hello...

No. 67
3.40 MB
Added 12.01.2016 20:57
Description edited 16.06.2017 19:29

Tellez Alex. Mastering Machine Learning with Spark 2.x (code)

archive
pdf
txt

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2017. — 323 p. — ISBN: 978-1-78528-345-1. Unlock the complexities of machine learning algorithms in Spark to generate useful data insights through this data analysis tutorial. The purpose of machine learning is to build systems that learn from data. Being able to understand trends and patterns in complex data is critical to success; it is one of the key...

No. 68
36.79 MB
Added 12.03.2018 01:53
Description edited 12.03.2018 02:23

Gourav Gupta, Dr. Manish Gupta, Dr. Inder Singh Gupta. Practical Machine Learning with Spark

pdf

Section: Distributed Computing and Systems → Apache Spark

BPB Publications, 2022. — 554 p. — ISBN: 978-93-91392-086. This book provides the reader with an up-to-date explanation of Machine Learning and an in-depth, comprehensive, and straightforward understanding of the architectural techniques used to evaluate and anticipate the futuristic insights of data using Apache Spark.

No. 69
18.04 MB
Added 03.07.2022 15:04
Description edited 03.07.2022 22:37

Ilijason R. Beginning Apache Spark Using Azure Databricks: Unleashing Large Cluster Analytics in the Cloud

pdf

Section: Distributed Computing and Systems → Apache Spark

Apress, 2020. — 281 p. — ISBN: 9781484257814. Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics...

No. 70
2.80 MB
Added 11.06.2020 17:07
Description edited 12.06.2020 03:11

Ramaswami Y. Time Series Analysis with Spark: A practical guide to processing, modeling, and forecasting time series with Apache Spark

pdf

Section: Distributed Computing and Systems → Apache Spark

Birmingham: Packt Publishing, 2025. — 292 p. — ISBN: 1803232250. Master the fundamentals of time series analysis with Apache Spark and Databricks and uncover actionable insights at scale. Key Features: Quickly get started with your first models and explore the potential of Generative AI. Learn how to use Apache Spark and Databricks for scalable time series solutions. Establish...

No. 71
18.70 MB
Added 24.04.2025 18:15
Description edited 24.04.2025 22:29

Sherif A., Ravindra A. Apache Spark Deep Learning Cookbook

pdf

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2018. — 474 p. — ISBN: 978-1788474221. A solution-based guide to put your deep learning models into production with the power of Apache Spark. Key Features: Discover practical recipes for distributed deep learning with Apache Spark. Learn to use libraries such as Keras and TensorFlow. Solve problems in order to train your deep learning models on Apache Spark. Book...

No. 72
47.05 MB
Added 12.08.2018 13:46
Description edited 23.08.2018 12:49

Iozzia G. Hands-On Deep Learning with Apache Spark: Build and deploy distributed deep learning applications on Apache Spark (code)

image
pdf
txt

Section: Distributed Computing and Systems → Apache Spark

Packt, 2019. — 322 p. — ISBN: 1788994613. Speed up the design and implementation of deep learning solutions using Apache Spark. Deep learning is a subset of machine learning where datasets with several layers of complexity can be processed. Hands-On Deep Learning with Apache Spark addresses the sheer complexity of technical and analytical parts and the speed at which deep...

No. 73
2.81 MB
Added 10.03.2019 14:03
Description edited 11.03.2019 09:39

Quddus Jillur. Machine Learning with Apache Spark Quick Start Guide: Uncover patterns, derive actionable insights, and learn from big data using MLlib

pdf

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2019. — 233 p. — ISBN: 978-1-78934-656-5. Combine advanced analytics including Machine Learning, Deep Learning Neural Networks and Natural Language Processing with modern scalable technologies including Apache Spark to derive actionable insights from Big Data in real time. Every person and every organization in the world manages data, whether they realize it or...

No. 74
9.37 MB
Added 22.01.2020 03:34
Description edited 09.03.2020 16:54

Quddus Jillur. Machine Learning with Apache Spark Quick Start Guide: Uncover patterns, derive actionable insights, and learn from big data using MLlib (Code Files)

image
pdf
txt

Section: Distributed Computing and Systems → Apache Spark

Packt Publishing, 2019. — 233 p. — ISBN: 978-1-78934-656-5. Code files only! Combine advanced analytics including Machine Learning, Deep Learning Neural Networks and Natural Language Processing with modern scalable technologies including Apache Spark to derive actionable insights from Big Data in real time. Every person and every organization in the world manages data, whether...

No. 75
9.18 MB
Added 22.01.2020 03:36
Description edited 09.03.2020 16:54

Karau H., Warren R. Эффективный Spark. Масштабирование и оптимизация [Effective Spark: Scaling and Optimization]

pdf

Section: Distributed Computing and Systems → Apache Spark

St. Petersburg: Piter, 2018. — 352 p.: ill. — (O’Reilly Bestsellers). — ISBN: 978-5-4461-0705-6. If you already have positive experience using Spark for small tasks but are still puzzling over where to find the unrivaled Spark performance that can crunch colossal volumes of data, this book is for you. It explains how to effectively...

No. 76
7.29 MB
Added 29.08.2019 16:25
Description edited 30.08.2019 15:37

Karau Holden, Konwinski Andy, Wendell Patrick, Zaharia Matei. Изучаем Spark. Молниеносный анализ данных [Learning Spark: Lightning-Fast Data Analysis]

pdf

Section: Distributed Computing and Systems → Apache Spark

Moscow: DMK Press, 2015. — 304 p. The volume of data processed in every area of human activity continues to grow rapidly. Are there effective techniques for working with it? This book covers Apache Spark, an open source cluster computing system that lets you quickly build high-performance data analysis programs. With Spark you can...

No. 77
15.68 MB
Added 14.10.2017 12:44
Description edited 15.10.2017 03:47

Perrin Jean-Georges. Spark в действии [Spark in Action]

pdf

Section: Distributed Computing and Systems → Apache Spark

Moscow: DMK Press, 2020. — 636 p. — ISBN: 978-5-97060-879-1. This book examines in detail how to organize big data processing using the Apache Spark analytics operating system. It carefully describes the processes of ingesting, transforming, and publishing the results of data processing, and demonstrates the capabilities of Apache Spark when working with a variety of...

No. 78
23.11 MB
Added 27.12.2020 05:24
Description edited 27.12.2020 06:25

Perrin Jean-Georges. Spark в действии: С примерами на Java, Python и Scala [Spark in Action: With Examples in Java, Python, and Scala]

pdf

Section: Distributed Computing and Systems → Apache Spark

Moscow: DMK Press, 2021. — 637 p. — ISBN: 978-5-97060-879-1. Enterprise data analysis begins with reading, filtering, and merging files and streams from many sources. The Spark data processing engine can handle these varied volumes of information as a recognized leader in the field, delivering speeds 100 times greater than, for example, Hadoop. Thanks to support for...

No. 79
14.47 MB
Added 25.02.2025 00:38
Description edited 25.02.2025 03:08