Data science

#snowflake
Data science
fromHackernoon
1 year ago

How a Startup Using Gremlin Beat Everyone to Google's Door | HackerNoon

Google's acquisition of Wiz for $32 billion signifies a decisive victory in the cloud security sector.
Data science
fromIT Pro
14 hours ago

Are geothermal data centers just hot air?

Geothermal energy is a reliable renewable source for powering large-scale data centers, particularly for high-density AI workloads.
Data science
fromTechzine Global
14 hours ago

Scale Computing and Veeam now deliver full backup integration

Veeam's backup software integrates with Scale Computing's virtualization platform, enabling agentless hypervisor backup.
fromHackernoon
2 months ago

5 Major Business Mistakes When Working with Big Data: Lessons from a Company Managing 16 TB of Data | HackerNoon

Over a quarter of data and analytics professionals worldwide estimate that poor-quality data costs companies over $5 million annually, with 7% putting the figure at $25 million or more.
Data science
fromInfoWorld
1 day ago

Google updates agents in BigQuery to further automate analytics tasks

Google enhances BigQuery with a new code interpreter and advanced analytics features, improving automation in data engineering and data science tasks.
#data-centers
fromInfoWorld
2 days ago

Apache Flink integrates AI for real-time decision-making

With the 2.1 release, Apache Flink also now supports Process Table Functions (PTFs), the most powerful kind of function for Flink SQL and Table API.
Data science
fromMarTech
1 week ago

Messy data is your secret weapon - if you know how to use it | MarTech

Recent advances in AI enable effective analysis of messy, unstructured data, challenging the long-held belief that data must be clean.
#data-management
fromInfoQ
1 week ago
Data science

Building Reproducible ML Systems with Apache Iceberg and SparkSQL: Open Source Foundations

fromInfoWorld
1 year ago

What is Microsoft Fabric? A big tech stack for big data

Microsoft Fabric is a comprehensive cloud-based analytics suite integrating various Microsoft components for diverse roles.
Data science
fromMedium
4 weeks ago

Scaling AI Responsibly: Lessons in Efficiency, Flexibility, and Platform Design

AI tooling development must prioritize speed and user-centric solutions to drive real-world impact.
fromDevOps.com
2 weeks ago

StarTree Bridges the Lakehouse Gap: Serving Apache Iceberg Data Directly to Applications - DevOps.com

"This introduces latency, complexity and what we call 'bloat,'" explains Chad Meley, SVP of Marketing at StarTree. "We're collapsing that serving and query layer into one piece of the puzzle, significantly reducing the bloat and simplifying that architecture."
Data science
#data-integration
fromHackernoon
1 year ago
Data science

A Developer's Guide to SeaTunnel and Hive Integration with Real-World Configs | HackerNoon

Data science
fromTechzine Global
2 weeks ago

Comeback of LTO tape: market grew significantly in 2024

LTO tape market experienced significant growth in 2024, with 176.5 exabytes of compressed capacity introduced, marking a 15.4% increase from 2023.
Data science
fromNew Relic
2 weeks ago

Database Performance Monitoring - Now GA: Deep Query Analysis

Enhanced Database Performance Monitoring enables direct query-level insights, improving DBAs' ability to manage database performance.
fromTechzine Global
2 weeks ago

Ataccama underlines AI data lineage for business users

Ataccama closes that gap by turning complex data logic into plain language. Business users can now trace a data point's origin and understand how it was profiled or flagged without relying on technical experts.
Data science
#ai
fromInfoWorld
3 weeks ago
Data science

Orchestrating AI-driven data pipelines with Azure ADF and Databricks: An architectural evolution

fromHackernoon
4 months ago

Redefining Data Operations With Data Flow Programming in CocoIndex | HackerNoon

In traditional systems, side effects lead to increased complexity, debugging challenges, and unpredictable behavior. CocoIndex adopts a pure data flow programming approach, ensuring reliability.
Data science
fromHackernoon
3 weeks ago

Effective Data Chunking and Querying with Pinecone and GPT-4o | HackerNoon

Optimizing data ingestion in Pinecone involves preprocessing markdown and splitting articles into fixed-length chunks for improved relevance.
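As a rough illustration of the fixed-length chunking this summary mentions, here is a minimal sketch. It assumes character-based chunks with an optional overlap; the article itself may split by tokens or other units. The helper name `chunk_text` and its parameters are hypothetical, and no Pinecone calls are shown.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 0) -> list[str]:
    """Split text into fixed-length character chunks (hypothetical helper).

    Consecutive chunks share `overlap` characters so that a sentence cut at
    a chunk boundary still appears intact in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

The last chunk may be shorter than `chunk_size`; whether to keep, pad, or merge it is a design choice that depends on the embedding model being used.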
Data science
fromInfoWorld
1 year ago

Snowflake updates developer tools, adds observability features

Snowflake introduces Trail for enhanced observability in data management workflows.
#data-analytics
fromMedium
3 weeks ago
Data science

The Data Science Playbook: Exploring Sports Analytics Through Real Datasets

Data analytics has become central to competitive advantage in sports, influencing coaching, player evaluation, and fan experience.
fromHackernoon
2 years ago

Why No Single Algorithm Solves Deduplication - and What to Do Instead | HackerNoon

Detecting duplicate entities at scale requires efficient methods to reduce comparisons and maintain high recall.
fromInfoWorld
1 year ago

What's new in MySQL 9.0

MySQL 9.0.0 introduces a new Vector datatype, JavaScript Stored Programs, updated library versions, and enhancements to the Event Scheduler, while deprecating old SHA-1 security.
Data science
fromTearsheet
4 weeks ago

Announcing the winners of Tearsheet's 2025 Data Awards - Tearsheet

Data and data sharing are fundamental to modern finance, with ecosystems built around customer information.
fromTechCrunch
1 month ago

AI is forcing the data industry to consolidate - but that's not the whole story | TechCrunch

There is a complete reset in how data is managed and flows around the enterprise. If people want to seize the AI imperative, they have to redo their data platforms in a very big way. And this is where I believe you're seeing all these data acquisitions, because this is the foundation to have a sound AI strategy.
Data science
fromInfoQ
1 month ago

Databricks Contributes Spark Declarative Pipelines to Apache Spark

Databricks is contributing the technology behind Delta Live Tables (DLT) to the Apache Spark project as Spark Declarative Pipelines, simplifying the development of streaming pipelines.
Data science
fromClickUp
1 month ago

Venn Diagram Alternatives for Data Visualization in 2025 | ClickUp

Venn diagrams use overlapping circles to show the relationship between two or more things, facilitating comparisons across various fields.
Data science
fromHackernoon
4 years ago

What If Your 'Messy' Data Is Actually Perfect? | HackerNoon

The Success Metrics layer guides transformation by defining what success looks like and how to recognize when it has been achieved.
fromTheregister
1 month ago

Coming to PostgreSQL - on-disk database encryption

Percona is providing Transparent Data Encryption (TDE) for PostgreSQL to enhance database security, helping customers meet compliance requirements without licensing fees or restrictions.
Data science
fromIT Pro
1 month ago

How can businesses handle data sprawl?

Data sprawl and content sprawl create significant challenges for organizations due to unstructured data growth and lack of governance.
Data science
fromHackernoon
2 years ago

Deep Dive into MS MARCO Web Search: Unpacking Dataset Characteristics | HackerNoon

The MS MARCO dataset reveals considerable multilingual disparity and significant data skew, highlighting challenges in model evaluation and training.
fromHackernoon
1 month ago

How to Write Complex Queries in Apache Spark SQL Using CTE (WITH Clause) | HackerNoon

A Common Table Expression (CTE) is a named, temporary result set defined within a single SQL statement, which helps in improving query readability and maintainability.
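To make the CTE definition above concrete, here is a small sketch of a `WITH` clause. It runs against Python's built-in `sqlite3` module rather than Spark, purely so the example is self-contained; the table, column, and function names are hypothetical, and the `?` placeholder is SQLite-specific.

```python
import sqlite3


def top_customers(rows, min_total):
    """Return (customer, total) pairs whose summed order amount >= min_total."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

    # customer_totals is the CTE: a named result set that exists only
    # for the duration of this single statement.
    query = """
        WITH customer_totals AS (
            SELECT customer, SUM(amount) AS total
            FROM orders
            GROUP BY customer
        )
        SELECT customer, total
        FROM customer_totals
        WHERE total >= ?
        ORDER BY total DESC
    """
    result = conn.execute(query, (min_total,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [("ann", 10.0), ("bob", 5.0), ("ann", 15.0), ("cat", 30.0)]
    print(top_customers(sample, 20))
```

Naming the aggregation step keeps the outer query short, which is the readability benefit the summary refers to.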
Data science
fromESPN.com
1 month ago

NHL draft grades: From the excellent (Islanders, Hurricanes) to the confusing (Maple Leafs)

The 2025 NHL draft drew criticism for its lengthy process and decentralized format, with calls for a return to centralized drafting.
fromMedium
1 month ago

Frequent Spark Interview Questions - Part 2

Both cache() and persist() store an RDD/DataFrame/Dataset in memory (or on disk) to avoid recomputation. For RDDs, cache() is shorthand for persist(StorageLevel.MEMORY_ONLY) (DataFrames and Datasets default to MEMORY_AND_DISK), while persist() lets you choose the storage level explicitly.
Data science
fromDevOps.com
1 month ago

DataOps and Automation: The Future of Database Management - DevOps.com

Implementing DataOps can significantly enhance deployment velocity by automating database operations, reducing errors and manual delays.
Data science
fromTheregister
1 month ago

A trip through vintage datacenter networking

The evolution of datacenter networking has transformed from proprietary systems to complex modern technologies.
Early networking was defined by compatibility issues and manufacturer-specific protocols.
fromInfoWorld
1 month ago

Teradata aims to simplify on-premises AI for data scientists with AI Factory

Teradata's AI Factory simplifies on-prem AI lifecycle management, reducing reliance on hybrid solutions and improving data sovereignty.
#apache-spark
fromMedium
1 month ago
Data science

RDD vs DataFrame vs Dataset in Apache Spark: Which One Should You Use and Why

fromwww.theguardian.com
1 month ago

Antarctic ice has grown again but this does not buck overall melt trend

Antarctic ice gained mass from 2021 to 2023, showing climate change follows a jagged path with temporary gains amid long-term losses.
Data science
fromTalkpython
1 month ago

From Notebooks to Production Data Science Systems

She emphasized the idea that moving from exploratory data analysis in Jupyter notebooks to production involves not just technical skills, but also leveraging software engineering principles.
Data science
fromMedium
1 month ago

Announcing the ODSC West 2025 Call for Speakers

ODSC West 2025 is inviting speakers to share insights in various data science and AI topics.
A diverse audience of data science professionals will attend the conference.
Speakers will benefit from networking opportunities and perks including a conference pass.
Data science
fromMedium
1 month ago

Empowering Secure AI with Open-Source LLMs and Compute-Over-Data

Organizations can leverage LLMs securely and efficiently by using open-source models to maintain data privacy.
fromSimplilearn.com
2 years ago

What is XGBoost? An Introduction to XGBoost Algorithm in Machine Learning | Simplilearn

XGBoost is an open-source library that can train and test models on large amounts of data.
It is used to predict ad click-through rates and classify high-energy physics events.
fromComputerWeekly.com
1 month ago

Interview: Pure Storage on the AI data challenge beyond hardware | Computer Weekly

Data quality is critical for successful AI workloads, necessitating proper data management and preparation before computational resources are utilized.
fromRealpython
1 month ago

Starting With DuckDB and Python - Real Python

DuckDB provides a powerful, seamless way to manage large datasets in Python, utilizing OLAP optimization for enhanced data handling and query capabilities.
Data science
fromNature
1 month ago

We need to predict the people disasters will hit, not just the places

Local authorities need to identify at-risk populations in disaster areas to save lives effectively.
fromwww.bbc.com
1 month ago

Notts boss Paterson on data and management by committee

Martin Paterson embraces a collaborative approach as head coach at Notts County, prioritizing football expertise over data analytics.
Data science
fromNature
1 month ago

Will Gates and other funders save massive public health database at risk from Trump cuts?

The termination of the DHS program threatens global health data collection and monitoring, impacting health policy and community well-being. Ultimately, funding is critical.
Data science
fromNature
1 month ago

Medical AI can transform medicine - but only if we carefully track the data it touches

Advanced machine learning can enhance early detection in medicine, but uncertainty in predictions remains a challenge.
from24/7 Wall St.
1 month ago

Snowflake (NYSE: SNOW) Price Prediction and Forecast 2025-2030 (June 2025)

Shares of Snowflake Inc. surged 6.56% in the past month, achieving a year-to-date gain of 70.82%, with Q1 revenue exceeding $1 billion for the first time.
Data science
fromWIRED
1 month ago

India Is Using AI and Satellites to Map Urban Heat Vulnerability Down to the Building Level

Remote-sensing data and AI are being utilized to identify heat-vulnerable buildings in cities like Delhi, targeting efforts to provide relief during extreme temperatures.
Data science
fromSimplilearn.com
5 years ago

A Step-by-Step Guide for a Smooth Career Transition to Data Science

Looking to transition into a data science career?
Your existing software engineering skills would be a great asset in the data science field.
fromSimplilearn.com
3 years ago

Top U.S. Data Scientist Salaries in 2025 | Simplilearn

Our world generates more data than ever.
Hence, demand for people who can work with data will keep growing.
U.S. data scientist salaries can vary by state and company.
fromHackernoon
1 year ago

Are Judeo-Christian Values the Foundation of American Democracy? | HackerNoon

There are some that claim the US Constitution is a product of a Judeo-Christian culture, asserting that democracy matured due to a Christian influence.
Data science
fromwww.npr.org
1 month ago

Greetings from Shenyang, China, where workers sort AI data in 'Severance'-like ways

Cities like Shenyang, once reliant on declining industries, are reinventing themselves by focusing on new tech initiatives, particularly in AI data processing to create new jobs.
Data science
fromTalkpython
1 month ago

10 Polars Tools and Techniques To Level Up Your Data Science

Polars offers numerous advantages over Pandas, especially when enhanced with tailored libraries.
Data science
fromLos Angeles Times
1 month ago

'We are still here, yet invisible.' Study finds that U.S. government has overestimated Native American life expectancy

Official U.S. records greatly underestimate mortality and life expectancy disparities for Native Americans, revealing serious discrepancies in health statistics.
Data science
fromHackernoon
4 months ago

The 5 Ingenious Data Structures (and What They Actually Do) | HackerNoon

Understanding the foundational data structures is essential for effective programming.
Specialized data structures address unique challenges faced with larger and more complex datasets.
fromeLearning Industry
1 month ago

Data-Driven L&D: Building Real-Time Learning Analytics Dashboards With No-Code

No-code analytics dashboards enhance Learning and Development (L&D) by providing real-time, actionable insights to improve training outcomes.
fromInfoWorld
1 month ago

Understanding how data fabric enhances data security and governance

Data fabric simplifies data management across fragmented environments, enhancing security and governance.
fromHackernoon
1 month ago

The Data Science Behind r/antiwork's Upvotes | HackerNoon

The dataset for our analysis was shaped by filtering out potentially biased comments, ensuring that the final set was representative and valid for our study.
Data science
fromInfoQ
1 month ago

HTAP: The Rise and Fall of Unified Database Systems?

HTAP has not achieved its goal of unifying transaction and analytical processing, leading experts to prefer specialized systems.
Data science
fromHackernoon
55 years ago

Postgres and the Lakehouse Are Becoming One System - Here's What Comes Next | HackerNoon

Modern data systems are blending Postgres with lakehouse technologies for enhanced data management and analytics.
Data science
fromThe Verge
1 month ago

Google has a new AI model and website for forecasting tropical storms

Google's new AI model forecasts tropical cyclones more accurately than traditional models, promising improved storm tracking and preparation.
Data science
fromInfoWorld
1 month ago

Use geospatial data in Azure with Planetary Computer Pro

Microsoft's Planetary Computer provides extensive geospatial data tools for researchers, leveraging data for machine learning and insights into environmental studies.
Data science
fromMedium
1 month ago

Showcasing the Future of Time Series Forecasting with Foundation Models

Foundation models are transforming time series forecasting, offering efficiency and adaptability across various sectors with advanced AI techniques.
Data science
fromHackernoon
1 month ago

The Future of Remote Sensing: Few-Shot Learning and Explainable AI | HackerNoon

Few-shot learning techniques for remote sensing enhance model efficiency with limited data, emphasizing the need for explainable AI.
Data science
fromHackernoon
2 months ago

Why Data Lies (and Your Model Might Too): The Curious Case of Simpson's Paradox | HackerNoon

Simpson's Paradox demonstrates how combining data can misrepresent trends, as marginal probabilities can obscure subgroup dynamics.
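The effect described above can be checked with simple arithmetic. The sketch below uses illustrative counts that are not taken from the article: two hypothetical treatments, each applied to a "small" and a "large" case group, recorded as (successes, trials). All names are assumptions made for the example.

```python
from fractions import Fraction

# Illustrative (successes, trials) counts per treatment and subgroup.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def subgroup_rate(treatment: str, subgroup: str) -> Fraction:
    """Success rate within a single subgroup."""
    successes, trials = counts[treatment][subgroup]
    return Fraction(successes, trials)


def pooled_rate(treatment: str) -> Fraction:
    """Success rate after combining all subgroups (the marginal rate)."""
    successes = sum(s for s, _ in counts[treatment].values())
    trials = sum(t for _, t in counts[treatment].values())
    return Fraction(successes, trials)


if __name__ == "__main__":
    for subgroup in ("small", "large"):
        a, b = subgroup_rate("A", subgroup), subgroup_rate("B", subgroup)
        print(f"{subgroup}: A={float(a):.3f} B={float(b):.3f}")
    print(f"pooled: A={float(pooled_rate('A')):.3f} B={float(pooled_rate('B')):.3f}")
```

With these counts, A has the higher rate inside each subgroup, yet B has the higher pooled rate, because A is applied mostly to the harder "large" cases. The subgroup sizes act as hidden weights in the pooled figure.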