Browse by Tags: data science - EdShare Southampton

Number of items: 24.

Allocation of students into groups CW3 COMP6235
Updated with students who joined the course later

Shared with the University by
Prof Elena Simperl
Allocation of topics to group projects COMP6235 CW3
Shared with the University by
Prof Elena Simperl
COMP6234 Week 1
Introduction to the module, fundamentals, history of data visualisation

Shared with the University by
Prof Elena Simperl
COMP6234 Week 6
Visual perception, information design for the brain, discussion of good and bad visualisations

Shared with the University by
Prof Elena Simperl
COMP6235 Introduction and fundamentals (I)
Fundamentals of data science and introduction to COMP6235

Shared with the University by
Prof Elena Simperl
COMP6235 Introduction and fundamentals (II)
Fundamentals of data science and introduction to COMP6235 (continued)

Shared with the University by
Prof Elena Simperl
Consider the Source: In Whose Interests, and How, of Big, Small and Other Data? Exploring data science through wellth scenarios.
We're not a particularly healthy culture. Our "normal" practices are not optimised for our wellbeing. From the morning commute, to the number of hours we believe we need to put in to complete a task that may itself be unreasonable, to the choices we make about time to prepare food to fit into these constraints: all these operations tend to make us feel forced into treating ourselves as secondary to our jobs. How can data help improve our quality of life? Fitbits and Apple Watches highlight the strengths and limits of Things that Count, not the least of which is the rather low uptake of such devices. So once we ask the question about how data might improve quality of life, we may need to add the caveat: pervasively, ubiquitously, in the rich variety of contexts that aren't all about Counting. And once we think about such all-seeing, all-knowing environments, we then need to think about privacy and anonymity. That is: does everything have to be connected to the internet to deliver on a vision of improved quality of life through data? And if there is a Big Ubiquity, should we think about inverting new norms, like how to make personal clouds and personal data stores far easier to manage, rather than outsourcing so much data and computation? In this short talk, I'd like to consider three scenarios (going where too few humans have gone before to help others; the challenges of qualitative data; supporting privacy and content) to motivate thinking about data capture, re-use and re-presentation, and opportunities across ECS for machine learning, AI, infoviz and HCI.

Shared with the University by
Ms Amber Bu
Data Observatories
Abstract: The Data Observatory (DO) at the Data Science Institute (DSI) in Imperial College (IC) is the largest interactive visualisation facility in Europe, consisting of 64 monitors arranged to give 313-degree surround vision and engagement. Opened in November 2015, the DO provides an opportunity for academics and industry to visualise data in a way that uncovers new insights, and promotes the communication of complex data sets and analysis in an immersive and multi-dimensional environment. Designed, built by, and housed within the DSI, the DO enables decision makers to derive new implications and actions from interrogating data sets in an innovative, unique environment. The talk will provide an overview of the DO capabilities and case studies of its use.

Shared with the University by
Ms Amber Bu
Data Science MSc - Introduction to Data Science. MongoDB Tutorial
Shared with the University by
Dr Ramine Tinati
Data Science MSc. Data Collection, Pre-Processing, and Mining
Shared with the University by
Dr Ramine Tinati
Data Science MSc. Tutorial on Data Collection from Twitter using NodeJS
Shared with the University by
Dr Ramine Tinati
Data Science MSc. Web Observatories for Data Science
Shared with the University by
Dr Ramine Tinati
Data Science Seminar: Generic Big Data Processing for Advancing Situation Awareness and Decision-Support
The generation of heterogeneous big data sources with ever increasing volumes, velocities and veracities over the last few years has inspired the data science and research community to address the challenge of extracting knowledge from big data. Such a wealth of generated data across the board can be intelligently exploited to advance our knowledge about our environment, public health, critical infrastructure and security. In recent years we have developed generic approaches to process such big data at multiple levels for advancing decision-support. This specifically concerns data processing with semantic harmonisation, low level fusion, analytics, knowledge modelling with high level fusion and reasoning. Such approaches will be introduced and presented in the context of the TRIDEC project results on critical oil and gas industry drilling operations and also the ongoing large eVacuate project on critical crowd behaviour detection in confined spaces.

Shared with the University by
Mr Roushdat Elaheebocus
Data Stories - Engaging with Data in a Post-truth World
Shared with the University by
Ms Amber Bu
Data collection and management
Shared with the University by
Prof Elena Simperl
Data science
Guest lecture COMP1205, fundamentals and applications of data science

Shared with the University by
Prof Elena Simperl
Expressiveness Benchmarking for System-level Provenance
Over the past decade a number of research prototypes have been developed that record provenance or other forms of rich audit logs at the operating system level. The last few years have seen the increasing use of such systems for security and audit, notably in DARPA's $60m investment in the Transparent Computing program. Yet the foundations for trust in such systems remain unclear; the correct behaviour of a provenance recording system has not yet been clearly specified or proved correct. Therefore, attempts to improve security through auditing provenance records may fail due to missing or inaccurate provenance, or misunderstanding the intentions of the system designers, particularly when integrating provenance records from different systems. Even worse, provenance recording systems are not even straightforward to test, because the expected behaviour is nondeterministic: running the same program at different times or on different machines is guaranteed to yield different provenance graphs, and running programs with nontrivial concurrency behaviour typically also yields multiple possible provenance graphs with different structure. We believe that such systems can be formally specified and verified, and should be in order to remove complex provenance recording systems from the trusted computing base. However, formally verifying such a system seems to require first having an accepted formal model of the operating system kernel itself, which is a nontrivial undertaking. In the short term, we propose provenance expressiveness benchmarking, an approach to understanding the current behaviour of a provenance recording system. The key idea (which is simple in principle) is to generate provenance records for individual system calls or short sequences of calls, and for each one generate a provenance graph fragment that shows how the call was recorded in the provenance graph.
The challenge is how to automate this process, given that provenance recording tools work in different ways, use different output formats, and generate different (but similar) graphs containing both target activity and background noise. I will present work on this problem so far, focusing on how to automate the NP-complete approximate subgraph isomorphism problems we need to solve to automatically extract benchmark results.

Shared with the University by
Ms Amber Bu
Group project COMP6235 2019 2020
Shared with the University by
Prof Elena Simperl
Human Data Interaction
Abstract: Data is everywhere. Today people are faced with the daunting task of understanding and managing the data created by them, about them, and for them, due to the lack of mechanisms between them and the data. In this talk, I will use some examples in my own research to explain how we can bridge the gap between humans and data through a series of interaction mechanisms. I will first explain how we can use agencies such as recommender systems to help people manage the access to their personal data. I will then explain how data visualisations can be used to help people extract better insights from their personal data. I will also introduce my on-going work about applying data visualisations to public data to help people make better decisions and, beyond visualisations, telling stories about data. Biodata: Yuchen Zhao is a research fellow at the Web and Internet Science Research Group (WAIS) in the school of Electronic and Computer Science (ECS), the University of Southampton. He received his Ph.D. in computer science from University of St Andrews in 2017. His research aims to understand and address the issues in human data interaction. His previous research focused on understanding privacy issues in personal data and designing agencies to help people solve those issues. His recent research has expanded to apply data visualisations and narrative visualisations to provide better insights, transparency, and engagement in public data.

Shared with the University by
Ms Amber Bu
Introduction to data science
Shared with the University by
Prof Elena Simperl
Scalable Data Integration
Information and data integration focuses on providing an integrated view of multiple distributed and heterogeneous sources of information (such as web sites, databases, peer or sensor data etc.). Through information integration all this scattered data can be combined and queried. In this talk we are dealing with the problems of data integration, data exchange/warehousing, and query answering with or without ontologies. We present an algorithm for virtual data integration where data sources are queried in a distributed way and no centralized repository is materialized. Our algorithm processes queries in the presence of thousands of data sources in under a second. We extend this solution to virtual integration settings where domain knowledge is represented using constraints/ontologies (e.g. OWL2-QL). Subsequently, we examine the Chase algorithm which is the main tool to reason with constraints for data warehousing, and develop an optimization that performs orders of magnitude faster. We also examine hybrid solutions to data integration where both materialization/warehousing and virtual data integration are combined in order to optimize query answering. We discuss how these approaches can help set up future research directions and outline important applications to data management and analysis over integrated data.

Shared with the University by
Ms Amber Bu
Sketching the vision of a Web of Debates
Web users have changed the Web from a means for publishing and exchanging documents to a means for sharing their feelings, beliefs, and opinions and participating in debates on any conceivable topic. Current web technologies fail to support this change: arguments and opinions are uploaded in purely textual form; as a result, they cannot be easily retrieved, processed and interlinked, and all this information is largely left unexploited. This talk will sketch the vision of Debate Web, which will enable the extraction, discovery, retrieval, interrelation and visualisation of the vast variety of viewpoints that exist online, based on machine-readable representations of arguments and opinions.

Shared with the University by
Ms Amber Bu
Spatial data integration for mapping progress towards the Sustainable Development Goals
Abstract: The UN sustainable development goals, an intergovernmental set of 17 aspirational goals and 169 targets to be achieved by 2030, were launched last year. These include ending poverty and malnutrition, improving health and education, and building resilience to natural disasters and climate change. A particular focus across the goals and targets is achievement 'everywhere', ensuring that no one gets left behind and that progress is monitored at subnational levels to avoid national-level statistics masking local heterogeneities. How will this subnational monitoring of progress towards meeting the goals be undertaken when many countries will undertake just a single census in the 2015-2030 monitoring period? Professor Tatem will present an overview of the work of the two organizations he directs, WorldPop (www.worldpop.org) and Flowminder (www.flowminder.org), in meeting the challenges of constructing consistent, comparable and regularly updated metrics to measure and map progress towards the sustainable development goals in low and middle income countries, and where the integration of traditional and new forms of data, including those derived from satellite imagery, GPS and mobile phones, can play a role.

Shared with the University by
Ms Amber Bu
The Chemistry of Data
Abstract: In my talk I will discuss the way in which the ideas of Data Science, the Web and Semantic Web, and Open Science contribute to new methods and approaches to data driven chemistry and chemical informatics. A key aspect of the discussion will be how to facilitate the improved acquisition, integration and analysis of chemical data in context. I will refer to lessons learnt in the e-Science and Digital Economy (particularly the IT as a Utility Network) programmes and the EDISON H2020 project. Jeremy G. Frey obtained his DPhil on experimental and theoretical aspects of van der Waals complexes, in Oxford, followed by a fellowship at the Lawrence Berkeley Laboratory with Yuan Lee. In 1984 he joined the University of Southampton, where he is now Professor of Physical Chemistry and head of the Computational Systems Chemistry Group. His experimental research probes molecular organization from single molecules to liquid interfaces using laser spectroscopy from the IR to soft X-rays. In parallel he investigates how e-Science infrastructure supports intelligent access to scientific data. He is strongly committed to collaborative inter- and multi-disciplinary research and is skilled in facilitating communication between diverse disciplines speaking different languages. He has successfully led several large interdisciplinary collaborative RCUK research grants, from Basic Technology (Coherent Soft X-Ray imaging), e-Science (CombeChem) and most recently the Digital Economy Challenge area of IT as a Utility Network+, where he has successfully created a unique platform to facilitate collaboration across the social, science, engineering and design domains, working with all the research, commercial, third and governmental sectors.

Shared with the University by
Ms Amber Bu

This list was generated on Sat Feb 28 22:05:31 2026 UTC.
