List of Questions
What best practices should CLS consider, especially in the areas of cataloging and adding metadata tags to scientific data?
The Canadian Light Source will be developing a new Scientific Data Management architecture and strategy in 2018. This strategy will cover all aspects of obtaining, storing, and distributing scientific data.
What services should the CLS offer based on the available resources?
In order to attract clients and improve the services offered to staff and users, CLS is exploring options for adding High Performance Computing services. So far, we have identified three main choices: hosting locally, or offsite services from organizations such as the University of Saskatchewan and Compute Canada.
What is the best approach to sourcing and screening public source data to incorporate into proprietary data visualization models?
Building a robust data visualization model may require the incorporation of public data, such as data from Statistics Canada, the Bank of Canada, and commodities indexes/prices. In many cases, the frequency of source data updates and the interfaces vary considerably. What is the best approach to incorporating and representing multiple external sources of data in dynamic representation models?
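One common pattern (a minimal stdlib-only sketch; the series, values, and day-number timestamps below are invented for illustration) is to pull each external source on its own schedule and then align everything onto a single time grid by carrying each source's last observation forward:

```python
from bisect import bisect_right

def align_last_value(timestamps, values, grid):
    """For each grid time, carry forward the most recent observation
    (None when the series has not started yet)."""
    out = []
    for t in grid:
        i = bisect_right(timestamps, t)
        out.append(values[i - 1] if i > 0 else None)
    return out

# Invented sources: a daily FX series and a monthly commodity index,
# both aligned onto a weekly grid (times expressed as day numbers).
fx_times, fx_vals = [1, 2, 3, 8, 9], [1.31, 1.32, 1.30, 1.29, 1.33]
idx_times, idx_vals = [1, 31], [100.0, 104.5]
grid = [7, 14, 21, 28]

print(align_last_value(fx_times, fx_vals, grid))
print(align_last_value(idx_times, idx_vals, grid))
```

A production version would use real timestamps and a scheduler per source, but the alignment step stays the same regardless of how often each source updates, which keeps the visualization layer decoupled from source frequencies.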
How can we capture real-time performance data at specific points on all our equipment to help detect early causes of failure and reduce costs?
Refinery Equipment Maintenance Analytics - At our refinery, a single pump failure can cost us thousands of dollars a day in lost production. We would like to use IoT to monitor more key points and equipment, at less cost. We want data analytics to identify new areas of performance improvement and to pinpoint exactly when pump and filter replacement will begin to affect performance.
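A simple starting point for this kind of early-warning detection (a stdlib-only sketch; the vibration readings, window size, and threshold are invented for illustration) is to flag any sensor reading that deviates sharply from its recent rolling baseline:

```python
from collections import deque
from statistics import mean, stdev

def rolling_anomalies(readings, window=10, threshold=3.0):
    """Flag readings that deviate more than `threshold` standard
    deviations from the trailing window of normal operation."""
    history = deque(maxlen=window)
    flags = []
    for x in readings:
        if len(history) >= 3:
            mu, sigma = mean(history), stdev(history)
            flags.append(sigma > 0 and abs(x - mu) > threshold * sigma)
        else:
            flags.append(False)  # not enough history yet
        history.append(x)
    return flags

# Invented pump vibration readings: stable operation, then a spike.
readings = [5.0, 5.1, 4.9, 5.0, 5.2, 5.1, 9.8, 5.0]
print(rolling_anomalies(readings, window=5))  # only the 9.8 spike is flagged
```

Rolling z-scores catch sudden spikes; slow degradation (the "when will replacement start to matter" question) would instead need trend models over much longer windows.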
How do we link our CRM with real-time pop product sales to optimize our targeted push marketing efforts and promotions?
Effective Campaign Management - We want to access data on product demand levels on a minute-to-minute basis across our fleet of stores. In order to avoid stocking shortfalls and misguided marketing efforts, we don't want to send a consumer a promotion if that consumer has already bought the product a couple of days ago. We want to come up with a list of 10 items per household that we think (based on predictive analytics) they are likely to buy. On the day the flyer comes out, we want to send those households an email listing those 10 products, telling them they are on sale at their favorite store.
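The final selection step described above can be sketched in a few lines (the product names and scores are hypothetical; the predictive model that produces the scores is the hard part and is not shown):

```python
import heapq

def pick_flyer_items(scores, recent_purchases, n=10):
    """Rank a household's predicted-purchase scores, excluding items
    the household already bought recently, and keep the top n."""
    candidates = {item: s for item, s in scores.items()
                  if item not in recent_purchases}
    return heapq.nlargest(n, candidates, key=candidates.get)

# Hypothetical predictive scores for one household.
scores = {"cola": 0.9, "chips": 0.8, "milk": 0.7, "bread": 0.6, "gum": 0.2}
recent = {"cola"}  # bought a couple of days ago, so exclude it
print(pick_flyer_items(scores, recent, n=3))  # ['chips', 'milk', 'bread']
```

Keeping the "recently purchased" exclusion as a separate filter on top of the model's scores makes the business rule auditable and easy to change independently of the model.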
IBM Deep Thunder is one example. What other models (or potential models) exist that could improve precision and accuracy in short/immediate-term (every 3 hours) weather forecasting for crop farmers in Canada? How can big data and machine learning be used to produce hyper-local (1-2 kilometer resolution or less) forecasts over a 3-48 hour period?
How can we design a system to capture noise data within specific communities in a city that can be used to develop noise reduction strategies and regulations to improve the population's productivity and efficiency?
As more people move to urban areas, the level of noise pollution is also on the rise. According to the findings of the World Health Organization (WHO), noise is the second largest environmental cause of health problems, just after the impact of air quality (particulate matter). In addition to traffic-related noise accounting for more than one million healthy life years lost in Europe, the economic costs of traffic, rail and road noise pollution across the EU were recently estimated at €40 billion per year (just under 52 billion U.S. dollars), equivalent to 0.35% of the EU's GDP. According to the European Commission's 2011 White Paper on Transport, traffic noise-related external costs will increase by €20 billion (about 26 billion U.S. dollars) per year by 2050 (compared to 2005) unless further action is taken. The impact of noise pollution on human health and other species is of serious concern. We need to develop technologies to monitor the level of exposure and find new innovative solutions to minimize the problem.
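At the sensor end, the core quantity such a monitoring network would log is the sound pressure level. A minimal sketch of the standard SPL calculation (the synthetic 440 Hz tone stands in for real microphone samples, which would need calibration to pascals):

```python
import math

def sound_level_db(samples, reference=20e-6):
    """Sound pressure level in dB from pressure samples (Pa),
    relative to the 20 micropascal threshold of hearing."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / reference)

# A pure tone with an RMS pressure of 0.2 Pa is 80 dB SPL by definition.
tone = [0.2 * math.sqrt(2) * math.sin(2 * math.pi * 440 * t / 8000)
        for t in range(8000)]
print(round(sound_level_db(tone), 1))  # ~80.0 dB
```

Community noise regulations typically work with A-weighted, time-averaged levels (LAeq), which layer a frequency weighting and averaging windows on top of this basic calculation.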
What tools and frameworks should we consider for analyzing our data? What are the criteria that we need to determine in order to make comparisons?
As we collect and aggregate data, what sorts of things do we need to monitor and what proactive actions can we take to avoid creating data management problems?
Given a dataset with blocks of free-form unstructured text data, what are the best tools or techniques (considering accuracy and minimization of human effort) to extract contextual information?
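As a low-effort baseline to compare heavier tools against, frequency-based keyword extraction can be done with the standard library alone (the stopword list and the sample report below are invented for illustration; dedicated NLP libraries add entity and relation extraction on top of this):

```python
import re
from collections import Counter

# A deliberately tiny stopword list; real pipelines use curated ones.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "for", "on", "with", "that", "this", "we", "at", "be", "it"}

def top_keywords(text, n=5):
    """Crude keyword extraction: tokenize, drop stopwords and short
    tokens, then rank the remaining terms by frequency."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [w for w, _ in counts.most_common(n)]

report = ("Pump failure at the north refinery. The pump seal failed "
          "after vibration alarms. Replace the pump seal and inspect "
          "vibration sensors.")
print(top_keywords(report, 3))  # ['pump', 'seal', 'vibration']
```

Measuring accuracy against a small hand-labeled sample of the free-form text is usually the cheapest way to decide whether a baseline like this suffices or a trained model is worth the human effort.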
How can we use advances in computer vision and image detection to automate operations and increase the efficiency and profitability of today's grain farms?
High-intensity agricultural grain production is an industry in transition. With the rapid advancement in sensor, control, automation, and intelligence technologies, intensive grain production is on the cusp of change. It will be important for producers to be in control of these technologies and have the ability to innovate and scale quickly as the industry transitions. Failure to develop and adopt new technologies will cause Canadian producers to become non-competitive on a local and global scale. This in turn has potential implications for existing trends in agriculture (larger and larger corporate farms) and raises questions about food security.
Linking proteomics to protein functionality from a food perspective
Assessing the functionalities of existing and new food proteins, modifying proteins accordingly, and identifying potential food applications for them is a big area we are focusing on, and there is a lot of demand from industry for this as well. We are trying to use predictive modeling based on available data to help us identify solutions quickly.
Linking proteins from Saskatchewan-based crops to health and wellness properties
Health benefits of Saskatchewan-based crops have been reported, but we haven't compiled this data or tried to use it in an effective manner to develop functional or nutraceutical ingredients. We would like to explore this possibility.
Use of previous process data for future process optimization
We have many years of data collected from running various projects, and we want to build a database from this data to see whether it can be used in future process identification and optimization.
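A first step could be as small as a single well-indexed table of past runs. The sketch below uses SQLite from the Python standard library; the column names and the project rows are invented placeholders for whatever conditions and outcomes the historical projects actually recorded:

```python
import sqlite3

# Minimal historical process database: one table of project runs
# with their conditions and outcomes.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE runs (
    project TEXT, temperature_c REAL, ph REAL, yield_pct REAL)""")
conn.executemany("INSERT INTO runs VALUES (?, ?, ?, ?)", [
    ("pea-protein-2015", 60.0, 7.0, 78.5),
    ("lentil-2016",      55.0, 6.5, 81.2),
    ("pea-protein-2017", 62.0, 7.2, 83.9),
])

# Retrieve the best historical outcomes near a planned temperature range.
rows = conn.execute(
    """SELECT project, temperature_c, yield_pct FROM runs
       WHERE temperature_c BETWEEN ? AND ?
       ORDER BY yield_pct DESC""", (58, 65)).fetchall()
print(rows)
```

Even this simple shape supports the use case described: when planning a new process, query for past runs with similar conditions and start from the ones with the best outcomes.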
How do we create a simple yet effective system to collect, store, analyze and report the workload of the Criminal Investigation Division using existing tools available to the dispatchers and first responders?
Police budgets continue to increase in Canada, while at the same time many parts of the nation are experiencing a reduction in crime rates. Governing bodies continue to be faced with the challenges of funding police budgets during times of increased scrutiny. Measuring the cost of investigations, or of one single crime, is a challenge due to the number of variables that can be involved. Police respond to much more than crime, and many of these incidents require more than an emergency response. Understanding the costs associated with investigations, including support work following the incident, would benefit police agencies across the nation.
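Because dispatchers and responders already produce time-stamped records, a minimal version of the reporting side could be a script over CSV exports of those records. The column names and rows below are hypothetical:

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV export of investigator time entries per case.
records = io.StringIO(
    "case_id,incident_type,officer,hours\n"
    "C-101,break_and_enter,Smith,4.5\n"
    "C-101,break_and_enter,Lee,3.0\n"
    "C-102,fraud,Smith,6.25\n"
)

# Aggregate total investigative hours per case.
hours_by_case = defaultdict(float)
for row in csv.DictReader(records):
    hours_by_case[row["case_id"]] += float(row["hours"])

print(dict(hours_by_case))  # {'C-101': 7.5, 'C-102': 6.25}
```

Multiplying hours by role-based cost rates, and grouping by incident type instead of case, would turn the same aggregation into the budget-level reporting the question asks about.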
Foundations – getting ready for Big Data. a) What are the key elements and most efficient ways to acquire analytics capacity that translates to business value at this stage of the analytics evolution? b) What are the foundational considerations for an optimal data infrastructure as the size, velocity, type, and number of sources of data expand exponentially?
This is a broad question about how we, as a company, can take foundational steps to ensure that we have the systems and skill sets to thrive in the evolving Big Data competitive environment. a) SGI CANADA has an in-house corporate analytics department and an actuarial department. These groups do modeling using GLMs and cluster analysis. This capability is in line with the industry for a company of our size, but the pace of change and the capacity and expertise of our largest competitors are evolving quickly. We need to continue to hire people with new skill sets and develop our existing resources. b) SGI CANADA is looking to migrate its system infrastructure to meet the changing needs of the business. Flexibility, capacity, speed, and responsiveness are some of the key characteristics we will need in our future-state system architecture.
How can you develop and implement a pricing model that uses machine learning to replace an existing process where a model is reviewed, updated and implemented at regular intervals?
Our current personal-lines pricing is based on GLM models. We generally review the pricing every six months and introduce updates as needed. The lag between getting the data and implementing changes to the model can open up gaps in our rating algorithm. The volume of data depends on the line of business and the province, but in some cases we have around 60,000 unique records per month. The expectation is that new data sources will emerge as well. In some markets in the world, pricing modifies the charged price in real time based on the type of quotes and the amount and type of business being converted, as they relate to business objectives. We don't currently have any machine learning expertise in house.
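The contrast between periodic refits and continuous updating can be illustrated with the simplest possible online learner: a linear model nudged by one gradient step per incoming record. This is a toy sketch in plain Python (the feature names, learning rate, and premium data are all invented; a real rating model would need far more care around regularization, monitoring, and regulatory approval):

```python
def sgd_step(weights, features, target, lr=0.1):
    """One online gradient-descent update of a linear pricing model:
    nudge the weights toward each observed outcome as records arrive,
    instead of refitting the whole model every six months."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = prediction - target
    return [w - lr * error * x for w, x in zip(weights, features)]

# Invented rating records: features are [intercept, risk_score].
weights = [0.0, 0.0]
stream = [([1.0, 0.5], 100.0), ([1.0, 1.0], 150.0)] * 500
for features, target in stream:
    weights = sgd_step(weights, features, target)

pred = sum(w * x for w, x in zip(weights, [1.0, 0.5]))
print(round(pred, 1))  # converges toward the observed 100.0
```

The operational point is that the model is always current with the latest record, closing the data-to-implementation lag, at the cost of needing continuous validation rather than a twice-yearly review gate.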
Aside from usage-based vehicle insurance, what other types of telematics, data collection devices, and frameworks might become key Big Data sources that will drive pricing, product development, and customer experience for insurers in the property and casualty space?
SGI CANADA conducted an auto insurance pilot using an onboard diagnostics device to rate policies based on how the vehicle was driven - this is called Usage Based Insurance. There are a lot of pilots in the industry around connected homes, agricultural machinery, and other devices that collect huge amounts of data that are not the traditional type, size, or velocity of data used in insurance. We want to get clear on what opportunities exist and, if feasible, to identify an opportunity to pilot an offering. (The usage-based insurance pilot used a third party to house and summarize data.)
We would like to explore better heat recovery opportunities from product dryers.
In potash processing we use various styles of dryers to dry the potash before sizing. We would like to look at opportunities to recover this heat and utilize it in other parts of the process for efficiency and possibly for carbon capture.
We would like to explore alternatives to the heavy liquid method used for metallurgical liberation studies.
The current heavy liquid method used for liberation studies poses a health and safety risk and involves a fairly lengthy procedure. We would like to explore alternative methods for this laboratory procedure.
We would like to explore various options for online monitoring of belt wear.
In potash operations there are many conveyors and other pieces of equipment that use belts. We would like to explore options for online wear and/or fatigue monitoring to help detect failures before they become critical.
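Once online or periodic wear readings exist, even a simple trend extrapolation gives a maintenance lead time. A stdlib-only sketch with invented belt thickness gauge readings:

```python
def days_to_threshold(days, thickness, min_thickness):
    """Fit a least-squares wear trend to belt thickness readings and
    extrapolate the day the belt reaches its minimum safe thickness."""
    n = len(days)
    mx = sum(days) / n
    my = sum(thickness) / n
    slope = (sum((d - mx) * (t - my) for d, t in zip(days, thickness))
             / sum((d - mx) ** 2 for d in days))
    intercept = my - slope * mx
    return (min_thickness - intercept) / slope

# Invented thickness readings (mm) taken at inspection intervals.
days = [0, 30, 60, 90]
thickness = [12.0, 11.4, 10.8, 10.2]
print(days_to_threshold(days, thickness, 8.0))  # ~ day 200
```

Linear extrapolation is a starting point; real belt wear can accelerate, so refitting on each new reading and alarming when the predicted date moves closer is the more robust pattern.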
Is there an opportunity to develop an online potassium grade monitoring instrument for dry processes?
Currently, samples are taken from the process and sent to the quality lab to determine potassium levels. We would like to explore cheaper ways of dry grading potassium rather than using isotopic instruments such as K40 probes.