Overview
This unit focuses on the foundational concepts of data science. Digital data is growing rapidly, and data is the underlying driver of the knowledge economy. This unit will give you foundational knowledge and practical skills in data collection, representation, storage, retrieval, management, analysis, and visualisation through the exploration of data-related challenges. You will also learn how big data and business analytics affect business performance, and how they support the development of useful information and knowledge for data-driven decision making.
Details
Pre-requisites or Co-requisites
Prerequisite: COIT11226 Systems Analysis
Co-requisite: COIT11237 Database Design & Implementation
Important note: Students enrolled in a subsequent unit who failed its prerequisite unit should drop the subsequent unit before the census date or within 10 working days of Fail grade notification. Students who do not drop the unit within this timeframe cannot later drop it without academic and financial liability. See details in the Assessment Policy and Procedure (Higher Education Coursework).
Offerings For Term 3 - 2024
Attendance Requirements
All on-campus students are expected to attend scheduled classes. In some units, these classes are identified as a mandatory (pass/fail) component and attendance is compulsory. International students on a student visa must maintain a full-time study load and meet both attendance and academic progress requirements in each study period (satisfactory attendance for international students is defined as maintaining at least an 80% attendance record).
Recommended Student Time Commitment
Each 6-credit Undergraduate unit at CQUniversity requires an overall time commitment of an average of 12.5 hours of study per week, making a total of 150 hours for the unit.
Class Timetable
Assessment Overview
Assessment Grading
This is a graded unit: your overall grade will be calculated from the marks or grades for each assessment task, based on the relative weightings shown in the table above. You must obtain an overall mark for the unit of at least 50%, or an overall grade of 'pass' in order to pass the unit. If any 'pass/fail' tasks are shown in the table above they must also be completed successfully ('pass' grade). You must also meet any minimum mark requirements specified for a particular assessment task, as detailed in the 'assessment task' section (note that in some instances, the minimum mark for a task may be greater than 50%). Consult the University's Grades and Results Policy for more details of interim results and final grades.
All University policies are available on the CQUniversity Policy site.
You may wish to view these policies:
- Grades and Results Policy
- Assessment Policy and Procedure (Higher Education Coursework)
- Review of Grade Procedure
- Student Academic Integrity Policy and Procedure
- Monitoring Academic Progress (MAP) Policy and Procedure - Domestic Students
- Monitoring Academic Progress (MAP) Policy and Procedure - International Students
- Student Refund and Credit Balance Policy and Procedure
- Student Feedback - Compliments and Complaints Policy and Procedure
- Information and Communications Technology Acceptable Use Policy and Procedure
This is not an exhaustive list of University policies. The full list is available on the CQUniversity Policy site.
Feedback, Recommendations and Responses
Every unit is reviewed for enhancement each year. At the most recent review, the following staff and student feedback items were identified and recommendations were made.
Feedback from Unit Evaluation
Feedback: Practice materials on the R language were insufficient.
Recommendation: Provide additional materials on the R language.
Feedback from Unit Evaluation
Feedback: Big data lecture materials (i.e., Hadoop) were not updated.
Recommendation: Review the lecture slides on big data and revamp the lecture materials.
Unit Learning Outcomes
- Discuss and demonstrate data science foundational concepts
- Investigate and evaluate applications for data storage, management, retrieval, and analysis and visualisation
- Apply knowledge to process data for data driven decision making
- Analyse and generate solutions to solve data-related challenges
- Demonstrate the knowledge required in using data science skills to solve business problems.
The Australian Computer Society (ACS) recognises the Skills Framework for the Information Age (SFIA). SFIA is in use in over 100 countries and provides a widely used and consistent definition of ICT skills. It is increasingly used in developing job descriptions and role profiles.
ACS members can use the tool MySFIA to build a skills profile at https://www.acs.org.au/professionalrecognition/mysfia-b2c.html
This unit contributes to the following workplace skills as defined by SFIA. The SFIA code is included:
Data Management (DATM)
Business Analysis (BUAN)
Data Analysis (DTAN)
IT Operation (ITOP)
Alignment of Assessment Tasks to Learning Outcomes
Assessment Tasks | LO 1 | LO 2 | LO 3 | LO 4 | LO 5
---|---|---|---|---|---
1 - Practical Assessment - 40% | • | • | | |
2 - Written Assessment - 40% | | • | • | • | •
3 - Presentation - 20% | | | • | • | •
Alignment of Graduate Attributes to Learning Outcomes
Graduate Attributes | Learning Outcomes | ||||
---|---|---|---|---|---|
1 | 2 | 3 | 4 | 5 | |
1 - Communication | |||||
2 - Problem Solving | |||||
3 - Critical Thinking | |||||
4 - Information Literacy | |||||
5 - Team Work | |||||
6 - Information Technology Competence | |||||
7 - Cross Cultural Competence | |||||
8 - Ethical practice | |||||
9 - Social Innovation | |||||
10 - Aboriginal and Torres Strait Islander Cultures |
Alignment of Assessment Tasks to Graduate Attributes
Assessment Tasks | GA 1 | GA 2 | GA 3 | GA 4 | GA 5 | GA 6 | GA 7 | GA 8 | GA 9 | GA 10
---|---|---|---|---|---|---|---|---|---|---
1 - Practical Assessment - 40% | • | | • | • | | • | | | |
2 - Written Assessment - 40% | • | • | • | • | | • | | • | • |
3 - Presentation - 20% | • | • | | • | | | | | |
Textbooks
Amazon Web Services in Action
Third Edition (March 2023)
Authors: Andreas Wittig and Michael Wittig
Manning Publications
ISBN: 9781633439160
Please source the third edition of this book, which is the latest edition and was published in March 2023. The versions available in the CQU library are older editions. The book is available at the link below.
https://www.manning.com/books/amazon-web-services-in-action-third-edition
Data Engineering with AWS
Second Edition (2023)
Author: Gareth Eagar
Packt Publishing
ISBN: 9781804614426
Data Science on AWS
(2021)
Authors: Chris Fregly, Antje Barth
O'Reilly Media, Inc.
ISBN: 9781492079392
The book is available at the link below.
https://learning.oreilly.com/library/view/data-science-on/9781492079385/
Practical Data Science with Hadoop and Spark: Designing and Building Effective Analytics at Scale
(2016)
Authors: Mendelevitch, O, Stella, C & Eadline, D
Addison-Wesley Professional
ISBN: 9780134024141
IT Resources
- CQUniversity Student Email
- Internet
- Unit Website (Moodle)
- Zoom capacity (webcam and microphone) will be required for online students
- Python 3.10 (or higher)
- Anaconda 2023 or later
- Jupyter Notebook 7 or later
- Spyder integrated development environment (IDE), latest version
- RStudio (IDE) and R
All submissions for this unit must use the referencing style: Harvard (author-date)
For further information, see the Assessment Tasks.
Week | Module/Topic | Chapter | Events and Submissions/Topic
---|---|---|---
1 | Introduction to Data Science: What is data science; data domination; innovation from internet giants; data science history; data science in modern enterprises; soft skills of a data scientist; data science project life cycle; types of data; big data; how big data is different. | |
2 | Identifying Data Problems: From business problems to data mining tasks; data mining tasks; data collection; business use cases; sampling; data mining process. | |
3 | Hadoop and Data Science: Storage requirements; what is Hadoop; Hadoop's evolution; Hadoop tools for data science. | |
4 | Data Presentation: Understand different ways of summarising data; choose the right table/graph for the right data and audience; self-explanatory graphics; attractive graphs and tables. | |
5 | Data Analytics: Why analytics; different types of analytics; delivery methods for the operational users; holistic approach to expand enterprise analytics; value of integration and data quality to analytics. | |
6 | Exploratory Analysis: Trend analysis; box plot; pairs plot; time series decomposition; geographical analysis. | | Assessment 1 - Practical Assessment (40% weighting). Due: Week 6 Friday (13 Dec 2024) 11:45 pm AEST
7 | Data Discovery and Data Mining: Data-driven decisions; enabling data-driven innovations; knowledge discovery process; data cleaning; data integration; data selection; data transformation; knowledge-based systems; data mining and its goals; data mining operation and process. | |
- | (no topic listed) | |
- | (no topic listed) | |
8 | Analytic Algorithms: Clustering analysis; regression analysis; classifier analysis; association analysis; cohort analysis; graph analysis; traverse pattern analysis. | |
9 | Data Integration: Analytic data integration; challenges in data integration; technologies in data integration; data mapping; data staging; data extraction; data transformation; data loading; need for integration; data integration approaches. | |
10 | Data Security and Privacy: Protection of personal data; data collection and significant risks; challenges of big data for data protection; confidentiality; integrity; availability; middleware security concerns; built-in database protection; privacy issues; data security and storage; identification and authentication. | |
11 | System Design: An overview of ML systems in the real world | | Assessment 2 - Written Assessment (40% weighting). Due: Week 11 Friday (31 Jan 2025) 11:45 pm AEST
12 | Cloud Computing for Data Processing | | Assessment 3 - Presentation (20% weighting). Due: Week 12 Friday (7 Feb 2025) 11:45 pm AEST
- | (no topic listed) | |
Unit Coordinator: Dr Ambi Jayal
Email: a.jayal@cqu.edu.au
1 Practical Assessment
This assessment is designed to reinforce the content taught in Weeks 1 to 5 and relates to Learning Outcomes 1 and 2. It is an individual assessment and should be submitted in Week 6. You will submit your work on the data processing exercises, which give you an opportunity to learn data storage and processing. Each week you will be presented with a data-related challenge, and you will use computer tools to manipulate data to solve that challenge. This task will help to build your knowledge of data formats, retrieval, and analysis techniques. This assessment contributes 40% of the total marks.
Assessment Due Date: Week 6 Friday (13 Dec 2024) 11:45 pm AEST
Return Date to Students: Within two weeks of submission
The assessment will be marked based on the following criteria:
- Quality of source code
- Submitted screenshots of outputs
- Analysis presented on the generated output
- Well-structured and coherent report
More details will be available on the Moodle site.
Learning Outcomes Assessed
- Discuss and demonstrate data science foundational concepts
- Investigate and evaluate applications for data storage, management, retrieval, and analysis and visualisation
Graduate Attributes
- Communication
- Critical Thinking
- Information Literacy
- Information Technology Competence
2 Written Assessment
This assessment is based on a case study that will be provided to you in teaching Week 6. You are required to write a report of 2000 words. This is an individual assessment and relates to Learning Outcomes 2, 3, 4 and 5. The report will follow a standard business report format. You will investigate and advise an organisation, whose details are given in the case study, on data storage, retrieval, and analysis mechanisms. You will also develop an analytic dashboard for the organisation. This assessment contributes 40% of the total marks.
Assessment Due Date: Week 11 Friday (31 Jan 2025) 11:45 pm AEST
Return Date to Students: Feedback and marks for this assessment will be released after the certification date as this unit does not have an exam.
The assessment will be marked based on the following criteria:
- Report formatting (font, header and footer, table of contents, numbering, referencing)
- Professional communication (correct spelling, grammar, formal business language used)
- Executive summary
- Report introduction
- Data collection and storage
- Data in action
- Model design and implementation
- Conclusion and recommendations
More details will be available on the Moodle site.
Learning Outcomes Assessed
- Investigate and evaluate applications for data storage, management, retrieval, and analysis and visualisation
- Apply knowledge to process data for data driven decision making
- Analyse and generate solutions to solve data-related challenges
- Demonstrate the knowledge required in using data science skills to solve business problems.
Graduate Attributes
- Communication
- Problem Solving
- Critical Thinking
- Information Literacy
- Information Technology Competence
- Ethical practice
- Social Innovation
3 Presentation
This assessment relates to Learning Outcomes 3, 4 and 5. It is an individual recorded presentation. The presentation topic is based on your report from Assessment 2 (Written Assessment) and the learning outcomes of this unit. You will need to record and submit a 5 to 7-minute video presentation explaining the key concepts. Your recorded video should show both you, the presenter, and your desktop within the frame. Please dress appropriately and professionally for your presentation. This assessment contributes 20% of the total marks.
Assessment Due Date: Week 12 Friday (7 Feb 2025) 11:45 pm AEST
Return Date to Students: Feedback and marks for this assessment will be released after the certification date as this unit does not have an exam.
The assessment will be marked based on the following criteria:
- Stay on topic
- Fulfil requirements of topic
- Quality of slides
- Presentation style
- Valid information presented
More details will be available on the Moodle site.
Learning Outcomes Assessed
- Apply knowledge to process data for data driven decision making
- Analyse and generate solutions to solve data-related challenges
- Demonstrate the knowledge required in using data science skills to solve business problems.
Graduate Attributes
- Communication
- Problem Solving
- Information Literacy
As a CQUniversity student you are expected to act honestly in all aspects of your academic work.
Any assessable work undertaken or submitted for review or assessment must be your own work. Assessable work is any type of work you do to meet the assessment requirements in the unit, including draft work submitted for review and feedback and final work to be assessed.
When you use the ideas, words or data of others in your assessment, you must thoroughly and clearly acknowledge the source of this information by using the correct referencing style for your unit. Using others’ work without proper acknowledgement may be considered a form of intellectual dishonesty.
Participating honestly, respectfully, responsibly, and fairly in your university study ensures the CQUniversity qualification you earn will be valued as a true indication of your individual academic achievement and will continue to receive the respect and recognition it deserves.
As a student, you are responsible for reading and following CQUniversity’s policies, including the Student Academic Integrity Policy and Procedure. This policy sets out CQUniversity’s expectations of you to act with integrity, examples of academic integrity breaches to avoid, the processes used to address alleged breaches of academic integrity, and potential penalties.
What is a breach of academic integrity?
A breach of academic integrity includes but is not limited to plagiarism, self-plagiarism, collusion, cheating, contract cheating, and academic misconduct. The Student Academic Integrity Policy and Procedure defines what these terms mean and gives examples.
Why is academic integrity important?
A breach of academic integrity may result in one or more penalties, including suspension or even expulsion from the University. It can also have negative implications for student visas and future enrolment at CQUniversity or elsewhere. Students who engage in contract cheating also risk being blackmailed by contract cheating services.
Where can I get assistance?
For academic advice and guidance, the Academic Learning Centre (ALC) can support you in becoming confident in completing assessments with integrity and to a high standard.