Home NewsX MGDC for SharePoint FAQ: How do I join File Actions with Files?

MGDC for SharePoint FAQ: How do I join File Actions with Files?

by info.odysseyx@gmail.com
0 comment 1 views


1. File operations

If you use the SharePoint File Operations dataset in Microsoft Graph Data Connect, you can combine it with the SharePoint File dataset to get detailed information about each file. This post explains how to do this, including the most common issues.

2. What are the file operations?

The SharePoint file activity dataset contains details about every time a file is accessed, deleted, downloaded, modified, moved, renamed, or uploaded. Information about each operation includes the type of operation, when the operation occurred, the file on which the operation was performed, the application used, and the user (actor) who performed the operation.

For more information about the dataset, see the full schema documentation at: dataconnection-dataset-sharepointfileactions.md.

3. Combine with other datasets

A common use for this dataset is to combine it with SharePoint sites or SharePoint files to understand the amount of activity going on for a particular site or file. You can use it to see which sites or files are most accessed on a particular day. By accumulating activity over a long period of time, you can find sites that have had no activity in the last 90 days.

To find details about sites related to a specific task, group your file tasks by site ID and then join the site. Below is an example SQL to get a list of the most popular sites for the time period covered by the file operation.

WITH ActionsBySite AS (
    SELECT
        SiteId,
        MIN(ActionDate) AS EarliestAction,
        MAX(ActionDate) AS LatestAction,
        COUNT(*) AS ActionCount
    FROM FileActions
    GROUP BY SiteId
)
SELECT
    S.Id AS SiteId,
    S.[RootWeb.Title] AS SiteName,
    S.CreatedTime AS SiteCreated,
    S.[RootWeb.LastItemModifiedDate] AS SiteLastModified,
    S.[RootWeb.TemplateId] AS SiteTemplateId,
    A.EarliestAction,
    A.LatestAction,
    A.ActionCount
FROM ActionsBySite AS A
JOIN Sites S
ON A.SiteId = S.Id
ORDER BY ActionCount DESC
LIMIT 10;

To find details about files related to a specific task, group file tasks by file and then combine them with the file. Here’s a similar example that gets a list of the most popular files for the time period to which a file operation applies:

WITH ActionsByFile AS (
    SELECT
        SiteId,
        WebId,
        ListId,
        ListItemId,
        MIN(ActionDate) AS EarliestAction,
        MAX(ActionDate) AS LatestAction,
        COUNT(*) AS ActionCount
    FROM FileActions
    GROUP BY SiteId, WebId, ListId, ListItemId
)
SELECT
    F.DirName,
    F.FileName,
    F.AuthorEmail,
    F.TimeCreated AS FileCreated,
    F.TimeLastModified AS FileLastModified,
    A.EarliestAction,
    A.LatestAction,
    A.ActionCount
FROM ActionsByFile AS A
JOIN Files F
ON A.SiteId = F.SiteId
AND A.WebId = F.WebId
AND A.ListId = F.ListId
AND A.ListItemId = F.ItemId
ORDER BY ActionCount DESC
LIMIT 10;

Note: If you authorize long-term retention of file activity data, please contact our Compliance team. There may be limits to how long we must keep your personal information.

4. Potential problems

For large data sets, such as files and file operations, there are scenarios where JOIN does not match perfectly. Here are some issues you may encounter when combining file operations with sites or files:

4a. other regions

Important: This section only applies to tenants using the Microsoft 365 Multi-Geo feature, as described next. Microsoft 365 multi-region and Multi-region functionality in OneDrive and SharePoint.

A JOIN operation excludes certain operations if the file operations are in a different area than the file. If your tenant has multiple regions Actions are recorded in the actor’s domain.. Actors in that region can access files in other regions that are not in that environment. To overcome this, you need to run a collection of SharePoint files from all regions and combine (unify) them into a single data set.

Jose_Barreto_0-1727728773960.png

Note: These datasets are segmented by region for compliance purposes, so your compliance team should run this scenario before combining data from multiple regions.

4b. matching date

There are certain situations where files are missing when attempting to join because the file operation snapshot is newer than the file snapshot date. For example, if a file activity snapshot from July 15th is combined with a file snapshot from July 1st, by default the JOIN excludes the file activity because the file data has not yet been created.

Additionally, it may take approximately a week for the SharePoint file dataset to be fully updated. You can exclude operations on files that were recently created but not yet included in the file dataset. So even if both the file activity snapshot and the file snapshot are from July 1, it is possible that some of the more recent files may not have been captured yet.

There is also a chance that some file operations will arrive late (typically less than 0.01%). For example, some of your July 1st tasks may appear in the July 2nd snapshot.

Therefore, it is generally recommended to wait until July 8 to process and report July 1 work to obtain the most complete data set. This ensures that you use the most up-to-date file data sets and tolerates late-arriving file operation data.

4c. Not a document library

Another reason for not finding a match between your file operations dataset and your file operations data set is when the file operation is for a file that is not in the document library. The SharePoint file dataset includes only files in document libraries and ignores other types of lists. This is to keep the file data set at a more reasonable size.

A common scenario here involves working with SharePoint pages (files with the “aspx” extension), usually located on a communication site. Although you won’t see these “aspx” pages in the file dataset, you can still count the number of jobs per site.

To bridge this gap, you need to provide additional detail by adding specific datasets for your SharePoint pages and/or SharePoint lists. This is something the team is considering in the future, but there is no expected date for this.

4d. List item ID is missing.

The file activity dataset shows telemetry reported by your application. For some of these applications, the ListItemId may not be reported or may be reported as “null”. In this case, the join excludes that operation.

You can summarize these operations at the site level even if the ListItemId is missing. You can also view a summary of user agents to identify specific applications that are not providing data.

5. Conclusion

We hope you enjoyed learning more about SharePoint file operations datasets in Microsoft Graph Data Connect. More information about Microsoft Graph Data Connect for SharePoint can be found here: https://aka.ms/SharePointData. You can find a lot of details here, including a list of available datasets and frequently asked questions.





Source link

You may also like

Leave a Comment

Our Company

Welcome to OdysseyX, your one-stop destination for the latest news and opportunities across various domains.

Newsletter

Subscribe my Newsletter for new blog posts, tips & new photos. Let's stay updated!

Laest News

@2024 – All Right Reserved. Designed and Developed by OdysseyX