{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Content Management: Identify insecure items\n",
    "\n",
    "> * 👟 Ready To Run!\n",
    "* 🗄️ Administration\n",
    "* 📦 Content Management\n",
    "\n",
    "Items of type WebMap, WebScene, or App contain collections of layers, basemaps, and other external services hosted on ArcGIS Online/Server. These services can be connected to via `http://` or `https://`, with HTTPS being the more secure protocol since it encrypts the connection. __It is recommended that all service URLs use the `https://` (or say, SSL) protocol__.\n",
    "\n",
    "This notebook will search through all WebMap/WebScene/App Items in an Organization, identifying the 'insecure' ones if one or more service URLs use `http://`. These items will be displayed in this notebook, persisted in `.csv` files, and can have the `potentially_insecure` tag added to them."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "toc": true
   },
   "source": [
    "<h1>**Table of Contents**<span class=\"tocSkip\"></span></h1>\n",
    "<div class=\"toc\"><ul class=\"toc-item\"><li><span><a href=\"#Identify-Items-That-Use-Insecure-URLs\" data-toc-modified-id=\"Identify-Items-That-Use-Insecure-URLs-1\">Identify Items That Use Insecure URLs</a></span><ul class=\"toc-item\"><li><span><a href=\"#Configure-Behavior\" data-toc-modified-id=\"Configure-Behavior-1.1\">Configure Behavior</a></span></li><li><span><a href=\"#Detecting-http-vs-https\" data-toc-modified-id=\"Detecting-http-vs-https-1.2\">Detecting http vs https</a></span><ul class=\"toc-item\"><li><span><a href=\"#WebMaps\" data-toc-modified-id=\"WebMaps-1.2.1\">WebMaps</a></span></li><li><span><a href=\"#WebScenes\" data-toc-modified-id=\"WebScenes-1.2.2\">WebScenes</a></span></li><li><span><a href=\"#Apps\" data-toc-modified-id=\"Apps-1.2.3\">Apps</a></span></li></ul></li><li><span><a href=\"#Output-CSV-Files\" data-toc-modified-id=\"Output-CSV-Files-1.3\">Output CSV Files</a></span></li><li><span><a href=\"#Miscellaneous-Functionality\" data-toc-modified-id=\"Miscellaneous-Functionality-1.4\">Miscellaneous Functionality</a></span></li><li><span><a href=\"#main()\" data-toc-modified-id=\"main()-1.5\">main()</a></span></li><li><span><a href=\"#Post-Processing\" data-toc-modified-id=\"Post-Processing-1.6\">Post Processing</a></span></li></ul></li><li><span><a href=\"#Conclusion\" data-toc-modified-id=\"Conclusion-2\">Conclusion</a></span><ul class=\"toc-item\"><li><ul class=\"toc-item\"><li><span><a href=\"#Rewrite-this-Notebook\" data-toc-modified-id=\"Rewrite-this-Notebook-2.0.1\">Rewrite this Notebook</a></span></li><li><span><a href=\"#Related-Notebooks\" data-toc-modified-id=\"Related-Notebooks-2.0.2\">Related Notebooks</a></span></li></ul></li></ul></li></ul></div>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get started, import the necessary libraries and connect to our GIS:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import csv\n",
    "import time\n",
    "from IPython.display import display, HTML\n",
    "import json\n",
    "import pandas\n",
    "import logging\n",
    "log = logging.getLogger()\n",
    "\n",
    "from arcgis.mapping import WebMap\n",
    "from arcgis.mapping import WebScene\n",
    "from arcgis.gis import GIS\n",
    "\n",
    "gis = GIS(\"home\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Configure Behavior\n",
    "\n",
    "Now, let's configure some variables specific to our organization that will tell our notebook how we want it to run. With the default `CHECK_ALL_ITEMS` set to `True`, this notebook will apply this check to all items an Organization. If you would instead prefer to only apply this check to certain groups of items, set `CHECK_ALL_ITEMS` to `False`, then set `GROUP_NAMES` to a list of group name strings.\n",
    "\n",
    "Modify the below cell to change that default behavior."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Set to `True` if you would like to check ALL items in an org\n",
    "CHECK_ALL_ITEMS = True\n",
    "# If `CHECK_ALL_ITEMS` is `False`, then it will check all items in these groups\n",
    "CHECK_THESE_GROUPS = ['group_name_1', 'group_name_2']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's specify what types of items we want to test. By default, this notebook will check `WebMap`, `WebScene`, and any `App` items.\n",
    "\n",
    "Modify the below cell to change that default behavior."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "CHECK_WEBMAPS = True\n",
    "CHECK_WEBSCENES = True\n",
    "CHECK_APPS = True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's specify what kind of behavior we want when we come across an insecure item. This notebook will automatically sort and display the insecure and secure items, but we can also configure if we want to add a `potentially_insecure` tag to all insecure items.\n",
    "\n",
    "The default behavior is __NOT__ to add the tag. Modify the below cell to change that default behavior."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "TRY_TAG_INSECURE_ITEMS = False"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Detecting http vs https\n",
    "\n",
    "A core component of this notebook will be detecting if a URL is `http://` or `https://`. We will do this by creating helper functions that use the built-in string library to see what the URL string starts with."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "def is_https(url):\n",
    "    return str(url).startswith(\"https:/\")\n",
    "\n",
    "def is_http(url):\n",
    "    return str(url).startswith(\"http:/\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### WebMaps\n",
    "\n",
    "This code cell defines a function that will test all URLs in a web map item; it will return the URLs that use `https://` and the URLs that use `http://`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_https_in_webmap(webmap_item):\n",
    "    \"\"\"Takes in an `Item` class instance of a Web Map Item.\n",
    "    Sorts all operational layers and basemap layers based on if\n",
    "    they are http or https, returns a tuple of \n",
    "    (https_urls, http_urls), with each being a list of URLs\n",
    "    \"\"\"\n",
    "    https_urls = []\n",
    "    http_urls = []\n",
    "    wm = WebMap(webmap_item)\n",
    "\n",
    "    # Concatenate all operational layers and basemap layers to one list\n",
    "    all_layers = list(wm.layers)\n",
    "    if hasattr(wm.basemap, 'baseMapLayers'):\n",
    "        all_layers += wm.basemap.baseMapLayers\n",
    "\n",
    "    # Test all of the layers, return the results\n",
    "    for layer in [layer for layer in all_layers \\\n",
    "                  if hasattr(layer, 'url')]:\n",
    "        if is_https(layer.url):\n",
    "            log.debug(f\"    [✓] url {layer['url']} is https\")\n",
    "            https_urls.append(layer.url)\n",
    "        elif is_http(layer.url):\n",
    "            log.debug(f\"    [X] url {layer['url']} is http\")\n",
    "            http_urls.append(layer.url)\n",
    "    return (https_urls, http_urls)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### WebScenes\n",
    "\n",
    "This code cell defines a function that will test all URLs in a web scene item; it will return the URLs that use `https://` and the URLs that use `http://`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_https_in_webscene(webscene_item):\n",
    "    \"\"\"Takes in an `Item` class instance of a web scene item.\n",
    "    Sorts all operational layers and basemap layers based on if\n",
    "    they are http or https, returns a tuple of \n",
    "    (https_urls, http_urls), with each being a list of URLs\n",
    "    \"\"\"\n",
    "    https_urls = []\n",
    "    http_urls = []\n",
    "    ws = WebScene(webscene_item)\n",
    "\n",
    "    # Concatenate all operational layers and basemap layers to 1 list\n",
    "    all_layers = []\n",
    "    for operationalLayer in ws.get('operationalLayers', []):\n",
    "        if 'layers' in operationalLayer:\n",
    "            for layer in operationalLayer['layers']:\n",
    "                all_layers.append(layer)\n",
    "        else:\n",
    "            all_layers.append(operationalLayer)\n",
    "    for bm_layer in ws.get('baseMap', {}).get('baseMapLayers', []):\n",
    "        all_layers.append(bm_layer)\n",
    "\n",
    "    # Test all of the layers, return the results\n",
    "    for layer in [layer for layer in all_layers \\\n",
    "                  if layer.get('url', False)]:\n",
    "        if is_https(layer.get('url', False)):\n",
    "            log.debug(f\"    [✓] url {layer['url']} is https\")\n",
    "            https_urls.append(layer['url'])\n",
    "        elif is_http(layer.get('url', False)):\n",
    "            log.debug(f\"    [X] url {layer['url']} is http\")\n",
    "            http_urls.append(layer['url'])\n",
    "    return (https_urls, http_urls)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Apps\n",
    "\n",
    "This code cell defines a function that will test all URLs in an app item; it will return the URLs that use `https://` and the URLs that use `http://`.\n",
    "\n",
    ">__Note__: App items don't have as standardized of JSON format as WebMaps and WebScenes. Therefore, the logic used to detect URLs in App Items will test every nested value in the dictionary returned from a `get_data()` call."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_values_recurs(dict_):\n",
    "    \"\"\"Helper function to get all nested values in a dict.\"\"\"\n",
    "    output = []\n",
    "    if isinstance(dict_, dict):\n",
    "        for value in dict_.values():\n",
    "            if isinstance(value, dict):\n",
    "                output += get_values_recurs(value)\n",
    "            elif isinstance(value, list):\n",
    "                for entry in value:\n",
    "                    output += get_values_recurs({\"_\":entry})\n",
    "            else:\n",
    "                output += [value,]\n",
    "    return output\n",
    "\n",
    "def test_https_in_app(app_item):\n",
    "    \"\"\"Takes in an `Item` class instance of any 'App' Item.\n",
    "    Will call `.get_data()` on the Item, and will search through\n",
    "    EVERY value nested inside the data dict, sorting each URL\n",
    "    found to either `https_urls` or `http_urls`, returning the \n",
    "    tuple of (https_urls, http_url)\n",
    "    \"\"\"\n",
    "    https_urls = []\n",
    "    http_urls = []\n",
    "    all_values = get_values_recurs(app_item.get_data())\n",
    "    for value in all_values:\n",
    "        if is_https(value):\n",
    "            https_urls.append(value)\n",
    "        elif is_http(value):\n",
    "            http_urls.append(value)\n",
    "    return (https_urls, http_urls)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The previously defined `test_https_...()` functions all follow a similar prototype of returning a tuple of `(https_urls, http_urls)`. We can therefore define a helper function will sort for us and call the correct function, based on the `item.type` property and the previously defined configuration variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "def test_https_for(item):\n",
    "    \"\"\"Given an `Item` instance, call the correct function and return \n",
    "    (https_urls, http_urls). Will return (None, None) if the item type \n",
    "    is not supported, or if configured to not check that item type.\n",
    "    \"\"\"\n",
    "    if (item.type == \"Web Map\") and CHECK_WEBMAPS:\n",
    "        return test_https_in_webmap(item)\n",
    "    elif (item.type == \"Web Scene\") and CHECK_WEBSCENES:\n",
    "        return test_https_in_webscene(item)\n",
    "    elif (\"App\" in item.type) and CHECK_APPS:\n",
    "        return test_https_in_app(item)\n",
    "    else:\n",
    "        return ([],[])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Output CSV Files\n",
    "\n",
    "We will be persisting the results of this notebook as two `.csv` files in the `/arcgis/home` folder, which will then also publish to our Organization.\n",
    "\n",
    "One `.csv` file (`ALL_URLS.csv`) will contain one row per URL. This file will contain an in-depth, comprehensive look of all secure/insecure URLs and how they related to items. This file is best analyzed by filtering in desktop spreadsheet software, manipulating in a `pandas` DataFrame, etc.\n",
    "\n",
    "The other `.csv` file (`INSECURE_ITEMS.csv`) will contain one row per Item. This will be a useful, 'human-readable' table that will give us a quick insight into what items contain insecure URLs.\n",
    "\n",
    "Let's create a `create_csvs()` function that creates these files with the appropriate columns and unique filenames; it will be called on notebook start."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "insecure_items_columns = ['item_id', 'item_title', 'item_url',\n",
    "                         'item_type', 'https_urls', 'http_urls']\n",
    "all_urls_columns = ['url', 'is_secure', 'item_id', \n",
    "                    'item_title', 'item_url', 'item_type']\n",
    "workspace = \"/arcgis/home\"\n",
    "\n",
    "def create_csvs():\n",
    "    \"\"\"When called, will create the two output .csv files with unique \n",
    "    filenames. Returns a tuple of the string file paths\n",
    "    (all_urls_path, insecure_items_path)\n",
    "    \"\"\"\n",
    "    all_urls_path = f'{workspace}/ALL_URLs-{time.ctime()}.csv'\n",
    "    insecure_items_path = f'{workspace}/INSECURE_ITEMS-{time.ctime()}.csv'\n",
    "    for file_path, columns in [(all_urls_path, all_urls_columns),\n",
    "                   (insecure_items_path, insecure_items_columns)]:\n",
    "        with open(file_path, 'w') as file:\n",
    "            writer = csv.DictWriter(file, columns)\n",
    "            writer.writeheader()\n",
    "    return (all_urls_path, insecure_items_path)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that the `.csv` files have been made with the correct headers/columns, we can create a function to add a row to the `ALL_URLS.csv` file. Each URL gets its own row, an `is_secure` boolean, and information related to the item the URL came from (item id, item type, etc.)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "def write_row_to_urls_csv(url, is_secure, item, file_path):\n",
    "    \"\"\"Given any URL from an item we've tested, write a\n",
    "    row to the output 'ALL_URLs.csv', located at `file_path`. This .csv\n",
    "    will have 1 row per URL, with information such as an `is_secure`\n",
    "    boolean, information about the item that contained the URL, etc.\n",
    "    \"\"\"\n",
    "    with open(file_path, 'a') as file:\n",
    "        writer = csv.DictWriter(file, all_urls_columns)\n",
    "        writer.writerow({'url' : url,\n",
    "                         'is_secure' : is_secure,\n",
    "                         'item_id' : item.id,\n",
    "                         'item_title' : item.title,\n",
    "                         'item_url' : item.homepage,\n",
    "                         'item_type' : item.type})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we can create a function to add a row to the `INSECURE_ITEMS.csv` file. In this file, each Item gets its own row, with related information like its item id, item url, a JSON representation of the https_urls, a JSON representation of http_urls, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "def write_row_to_insecure_csv(item, https_urls, http_urls, file_path):\n",
    "    \"\"\"Given an insecure item, write a row to the output \n",
    "    'INSECURE_URLS.csv' file, located at `file_path`. This .csv will \n",
    "    have 1 row per item, with information such as the item's ID,the \n",
    "    item's URL, a JSON representation of the list of http_urls and \n",
    "    https_urls, etc.\n",
    "    \"\"\"\n",
    "    with open(file_path, 'a') as file:\n",
    "        writer = csv.DictWriter(file, insecure_items_columns)\n",
    "        writer.writerow({'item_id' : item.id,\n",
    "                         'item_title' : item.title,\n",
    "                         'item_url' : item.homepage,\n",
    "                         'item_type' : item.type,\n",
    "                         'https_urls' : json.dumps(https_urls),\n",
    "                         'http_urls' : json.dumps(http_urls)})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Miscellaneous Functionality\n",
    "\n",
    "Another way we can persist the results from this notebook is to attempt to add a tag of `potentially_insecure` to all the insecure items we find via this function.\n",
    "\n",
    "> __Note__: An exception will NOT be thrown if an item's tag cannot be updated due to permissions, not being the item owner, etc. A warning message will be logged, but the function will return and the notebook will continue."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "def try_tag_item_as_insecure(item):\n",
    "    \"\"\"Will attempt to add a tag to the item that will mark it as \n",
    "    potentially insecure. If the tag cannot be updated (permissions,\n",
    "    not the owner, etc.), this function will still return, but it\n",
    "    will print out a WARNING message\n",
    "    \"\"\"\n",
    "    try:\n",
    "        tag_to_add = \"potentially_insecure\"\n",
    "        if tag_to_add not in item.tags:\n",
    "            item.update({'tags': item.tags + [tag_to_add]})\n",
    "    except Exception as e:\n",
    "        log.warning(f\"Could not tag item {item.id} as '{tag_to_add}'...\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's create a generator function that will `yield` `Item`(s). This notebook can run against all items in an organization, or all items from certain groups, depending on the value of the previously defined configuration variables."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_items_to_check():\n",
    "    \"\"\"Generator function that will yield Items depending on how you \n",
    "    configured your notebook. Will either yield every item in an \n",
    "    organization, or will yield items in specific groups.\n",
    "    \"\"\"\n",
    "    if CHECK_ALL_ITEMS:\n",
    "        for user in gis.users.search():\n",
    "            for item in user.items(max_items=999999999):\n",
    "                # For the user's root folder\n",
    "                yield item\n",
    "            for folder in user.folders:\n",
    "                # For all the user's other folders\n",
    "                for item in user.items(folder, max_items=999999999):\n",
    "                    yield item\n",
    "    else:\n",
    "        for group_name in CHECK_THESE_GROUPS:\n",
    "            group = gis.groups.search(f\"title: {group_name}\")[0]\n",
    "            for item in group.content():\n",
    "                yield item"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## main()\n",
    "\n",
    "Finally, let's create our `main()` function that links together all our previously defined functions that get all our web maps, web scenes, and apps, test the items, and write the results to the correct places."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "# After running main(), these in-memory variables will be populated\n",
    "secure_items = []\n",
    "insecure_items = []\n",
    "all_urls_csv_item = None\n",
    "insecure_items_csv_item = None\n",
    "\n",
    "def main():\n",
    "    # Tell user we're running, initialize variables/files\n",
    "    print(\"Notebook is now running, please wait...\\n-----\")\n",
    "    global secure_items, insecure_items, \\\n",
    "        all_urls_csv_item, insecure_items_csv_item\n",
    "    secure_items = []\n",
    "    insecure_items = []\n",
    "    all_urls_path, insecure_items_path = create_csvs()\n",
    "    \n",
    "    # Test each item, write to the appropriate file\n",
    "    for item in get_items_to_check():\n",
    "        print(f\"\\rChecking item {item.id}\", end=\"\")\n",
    "        https_urls, http_urls = test_https_for(item)\n",
    "        \n",
    "        # add all the item's URLs to the 'ALL_URLs.csv' output file\n",
    "        for urls, is_secure in [(https_urls, True), (http_urls, False)]:\n",
    "            for url in urls:\n",
    "                write_row_to_urls_csv(url, is_secure, \n",
    "                                      item, all_urls_path)\n",
    "\n",
    "        # If the item is insecure, add to 'INSECURE_ITEMS.csv' out file\n",
    "        if http_urls:\n",
    "            insecure_items.append(item)\n",
    "            write_row_to_insecure_csv(item, https_urls, http_urls,\n",
    "                                      insecure_items_path)\n",
    "            if TRY_TAG_INSECURE_ITEMS:\n",
    "                try_tag_item_as_insecure(item)\n",
    "        elif https_urls:\n",
    "            secure_items.append(item)\n",
    "\n",
    "    # Publish the csv files, display them in the notebook\n",
    "    display(HTML(\"<h1><u>RESULTS</u><h1>\"))\n",
    "    all_urls_csv_item = gis.content.add({}, all_urls_path)\n",
    "    display(all_urls_csv_item)\n",
    "    insecure_items_csv_item = gis.content.add({}, insecure_items_path)\n",
    "    display(insecure_items_csv_item)\n",
    "\n",
    "    # Display the items with insecure URLs\n",
    "    max_num_items_to_display = 10\n",
    "    display(HTML(f\"<h3>{len(insecure_items)} ITEMS \"\\\n",
    "                 \"USE INSECURE URLs</h3>\"))\n",
    "    for item in insecure_items[0:max_num_items_to_display]:\n",
    "        display(item)\n",
    "\n",
    "    # Tell user we're finished\n",
    "    print(\"-----\\nNotebook completed running.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have just defined a `main()` function, but we haven't called it yet. If you've modified the notebook, follow these steps:\n",
    "1. __Double check the notebook content__. Make sure no secrets are visible in the notebook, delete unused code, refactor, etc.\n",
    "2. Save the notebook\n",
    "3. In the 'Kernel' menu, press 'Restart and Run All' to run the whole notebook from top to bottom\n",
    "\n",
    "Now, `main()` can be called."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Notebook is now running, please wait...\n",
      "-----\n",
      "Checking item 7564cb4c28604841925ac0703504e6ea"
     ]
    },
    {
     "data": {
      "text/html": [
       "<h1><u>RESULTS</u><h1>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div class=\"item_container\" style=\"height: auto; overflow: hidden; border: 1px solid #cfcfcf; border-radius: 2px; background: #f6fafa; line-height: 1.21429em; padding: 10px;\">\n",
       "                    <div class=\"item_left\" style=\"width: 210px; float: left;\">\n",
       "                       <a href='https://datascienceqa.esri.com/portal/home/item.html?id=43dac0a79f144673bb96bc0ea55adf26' target='_blank'>\n",
       "                        \n",
       "                       </a>\n",
       "                    </div>\n",
       "\n",
       "                    <div class=\"item_right\"     style=\"float: none; width: auto; overflow: hidden;\">\n",
       "                        <a href='https://datascienceqa.esri.com/portal/home/item.html?id=43dac0a79f144673bb96bc0ea55adf26' target='_blank'><b>ALL_URLs-Fri Jan 18 22:41:57 2019</b>\n",
       "                        </a>\n",
       "                        <br/>CSV by portaladmin\n",
       "                        <br/>Last Modified: January 18, 2019\n",
       "                        <br/>0 comments, 0 views\n",
       "                    </div>\n",
       "                </div>\n",
       "                "
      ],
      "text/plain": [
       "<Item title:\"ALL_URLs-Fri Jan 18 22:41:57 2019\" type:CSV owner:portaladmin>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div class=\"item_container\" style=\"height: auto; overflow: hidden; border: 1px solid #cfcfcf; border-radius: 2px; background: #f6fafa; line-height: 1.21429em; padding: 10px;\">\n",
       "                    <div class=\"item_left\" style=\"width: 210px; float: left;\">\n",
       "                       <a href='https://datascienceqa.esri.com/portal/home/item.html?id=bc81b70910944fd39b474d90a0fc0a6d' target='_blank'>\n",
       "                        \n",
       "                       </a>\n",
       "                    </div>\n",
       "\n",
       "                    <div class=\"item_right\"     style=\"float: none; width: auto; overflow: hidden;\">\n",
       "                        <a href='https://datascienceqa.esri.com/portal/home/item.html?id=bc81b70910944fd39b474d90a0fc0a6d' target='_blank'><b>INSECURE_ITEMS-Fri Jan 18 22:41:57 2019</b>\n",
       "                        </a>\n",
       "                        <br/>CSV by portaladmin\n",
       "                        <br/>Last Modified: January 18, 2019\n",
       "                        <br/>0 comments, 0 views\n",
       "                    </div>\n",
       "                </div>\n",
       "                "
      ],
      "text/plain": [
       "<Item title:\"INSECURE_ITEMS-Fri Jan 18 22:41:57 2019\" type:CSV owner:portaladmin>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<h3>2 ITEMS USE INSECURE URLs</h3>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div class=\"item_container\" style=\"height: auto; overflow: hidden; border: 1px solid #cfcfcf; border-radius: 2px; background: #f6fafa; line-height: 1.21429em; padding: 10px;\">\n",
       "                    <div class=\"item_left\" style=\"width: 210px; float: left;\">\n",
       "                       <a href='https://datascienceqa.esri.com/portal/home/item.html?id=2b7bfdc9ea45431b8390b5425466dfe2' target='_blank'>\n",
       "                        \n",
       "                       </a>\n",
       "                    </div>\n",
       "\n",
       "                    <div class=\"item_right\"     style=\"float: none; width: auto; overflow: hidden;\">\n",
       "                        <a href='https://datascienceqa.esri.com/portal/home/item.html?id=2b7bfdc9ea45431b8390b5425466dfe2' target='_blank'><b>InsecureWM_World4_ForSelection1</b>\n",
       "                        </a>\n",
       "                        <br/>Insecure WM Web Map by portaladmin\n",
       "                        <br/>Last Modified: January 18, 2019\n",
       "                        <br/>0 comments, 2 views\n",
       "                    </div>\n",
       "                </div>\n",
       "                "
      ],
      "text/plain": [
       "<Item title:\"InsecureWM_World4_ForSelection1\" type:Web Map owner:portaladmin>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<div class=\"item_container\" style=\"height: auto; overflow: hidden; border: 1px solid #cfcfcf; border-radius: 2px; background: #f6fafa; line-height: 1.21429em; padding: 10px;\">\n",
       "                    <div class=\"item_left\" style=\"width: 210px; float: left;\">\n",
       "                       <a href='https://datascienceqa.esri.com/portal/home/item.html?id=0db3c0cb32574c328ef8f26f667f886f' target='_blank'>\n",
       "                        \n",
       "                       </a>\n",
       "                    </div>\n",
       "\n",
       "                    <div class=\"item_right\"     style=\"float: none; width: auto; overflow: hidden;\">\n",
       "                        <a href='https://datascienceqa.esri.com/portal/home/item.html?id=0db3c0cb32574c328ef8f26f667f886f' target='_blank'><b>InsecureWS_World4_ForSelection1_3D</b>\n",
       "                        </a>\n",
       "                        <br/>Insecure WS Web Scene by portaladmin\n",
       "                        <br/>Last Modified: January 18, 2019\n",
       "                        <br/>0 comments, 2 views\n",
       "                    </div>\n",
       "                </div>\n",
       "                "
      ],
      "text/plain": [
       "<Item title:\"InsecureWS_World4_ForSelection1_3D\" type:Web Scene owner:portaladmin>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-----\n",
      "Notebook completed running.\n"
     ]
    }
   ],
   "source": [
    "main()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If configured correctly, this notebook should have output two `.csv` files that can help you identify items that use insecure URLs."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Post Processing\n",
    "\n",
    "The `ALL_URLS.csv` file/item contains an in-depth, comprehensive look at all secure and insecure URLs and how they relate to items. This file contains a lot of information, which can be better analyzed using the `pandas` package. This code cell will convert any `.csv` Item to a pandas `DataFrame`; we will be converting the `ALL_URLS.csv` file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>url</th>\n",
       "      <th>is_secure</th>\n",
       "      <th>item_id</th>\n",
       "      <th>item_title</th>\n",
       "      <th>item_url</th>\n",
       "      <th>item_type</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>https://datascienceqa.esri.com/server/rest/ser...</td>\n",
       "      <td>True</td>\n",
       "      <td>1e44753d695e4abb860a74fdf091a5f7</td>\n",
       "      <td>01182019_WM_World_OnlyOpLayers</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Map</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>https://datascienceqa.esri.com/server/rest/ser...</td>\n",
       "      <td>True</td>\n",
       "      <td>1e44753d695e4abb860a74fdf091a5f7</td>\n",
       "      <td>01182019_WM_World_OnlyOpLayers</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Map</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>https://datascienceqa.esri.com/server/rest/ser...</td>\n",
       "      <td>True</td>\n",
       "      <td>1e44753d695e4abb860a74fdf091a5f7</td>\n",
       "      <td>01182019_WM_World_OnlyOpLayers</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Map</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>https://datascienceqa.esri.com/server/rest/ser...</td>\n",
       "      <td>True</td>\n",
       "      <td>1e44753d695e4abb860a74fdf091a5f7</td>\n",
       "      <td>01182019_WM_World_OnlyOpLayers</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Map</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>https://tunnelview.esri.com:6443/arcgis/rest/s...</td>\n",
       "      <td>True</td>\n",
       "      <td>1e44753d695e4abb860a74fdf091a5f7</td>\n",
       "      <td>01182019_WM_World_OnlyOpLayers</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Map</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                 url  is_secure  \\\n",
       "0  https://datascienceqa.esri.com/server/rest/ser...       True   \n",
       "1  https://datascienceqa.esri.com/server/rest/ser...       True   \n",
       "2  https://datascienceqa.esri.com/server/rest/ser...       True   \n",
       "3  https://datascienceqa.esri.com/server/rest/ser...       True   \n",
       "4  https://tunnelview.esri.com:6443/arcgis/rest/s...       True   \n",
       "\n",
       "                            item_id                      item_title  \\\n",
       "0  1e44753d695e4abb860a74fdf091a5f7  01182019_WM_World_OnlyOpLayers   \n",
       "1  1e44753d695e4abb860a74fdf091a5f7  01182019_WM_World_OnlyOpLayers   \n",
       "2  1e44753d695e4abb860a74fdf091a5f7  01182019_WM_World_OnlyOpLayers   \n",
       "3  1e44753d695e4abb860a74fdf091a5f7  01182019_WM_World_OnlyOpLayers   \n",
       "4  1e44753d695e4abb860a74fdf091a5f7  01182019_WM_World_OnlyOpLayers   \n",
       "\n",
       "                                            item_url item_type  \n",
       "0  https://datascienceqa.esri.com/portal/home/ite...   Web Map  \n",
       "1  https://datascienceqa.esri.com/portal/home/ite...   Web Map  \n",
       "2  https://datascienceqa.esri.com/portal/home/ite...   Web Map  \n",
       "3  https://datascienceqa.esri.com/portal/home/ite...   Web Map  \n",
       "4  https://datascienceqa.esri.com/portal/home/ite...   Web Map  "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "def csv_item_to_dataframe(item):\n",
    "    \"\"\"Takes in an Item instance of a `.csv` file,\n",
    "    returns a pandas DataFrame\n",
    "    \"\"\"\n",
    "    if item is not None:\n",
    "        downloaded_csv_file_path = item.download()\n",
    "        return pandas.read_csv(downloaded_csv_file_path)\n",
    "    else:\n",
    "        print(\"csv item not downloaded\")\n",
    "        return None\n",
    "\n",
    "df = csv_item_to_dataframe(all_urls_csv_item)\n",
    "if df is not None:\n",
    "    display(df.head())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that you have a pandas `DataFrame` instance, you can run `query()` on it,"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>url</th>\n",
       "      <th>is_secure</th>\n",
       "      <th>item_id</th>\n",
       "      <th>item_title</th>\n",
       "      <th>item_url</th>\n",
       "      <th>item_type</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>13</th>\n",
       "      <td>http://tunnelview.esri.com:6080/arcgis/rest/se...</td>\n",
       "      <td>False</td>\n",
       "      <td>2b7bfdc9ea45431b8390b5425466dfe2</td>\n",
       "      <td>InsecureWM_World4_ForSelection1</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Map</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>14</th>\n",
       "      <td>http://tunnelview.esri.com:6080/arcgis/rest/se...</td>\n",
       "      <td>False</td>\n",
       "      <td>2b7bfdc9ea45431b8390b5425466dfe2</td>\n",
       "      <td>InsecureWM_World4_ForSelection1</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Map</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>20</th>\n",
       "      <td>http://tunnelview.esri.com:6080/arcgis/rest/se...</td>\n",
       "      <td>False</td>\n",
       "      <td>0db3c0cb32574c328ef8f26f667f886f</td>\n",
       "      <td>InsecureWS_World4_ForSelection1_3D</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Scene</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>21</th>\n",
       "      <td>http://tunnelview.esri.com:6080/arcgis/rest/se...</td>\n",
       "      <td>False</td>\n",
       "      <td>0db3c0cb32574c328ef8f26f667f886f</td>\n",
       "      <td>InsecureWS_World4_ForSelection1_3D</td>\n",
       "      <td>https://datascienceqa.esri.com/portal/home/ite...</td>\n",
       "      <td>Web Scene</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                  url  is_secure  \\\n",
       "13  http://tunnelview.esri.com:6080/arcgis/rest/se...      False   \n",
       "14  http://tunnelview.esri.com:6080/arcgis/rest/se...      False   \n",
       "20  http://tunnelview.esri.com:6080/arcgis/rest/se...      False   \n",
       "21  http://tunnelview.esri.com:6080/arcgis/rest/se...      False   \n",
       "\n",
       "                             item_id                          item_title  \\\n",
       "13  2b7bfdc9ea45431b8390b5425466dfe2     InsecureWM_World4_ForSelection1   \n",
       "14  2b7bfdc9ea45431b8390b5425466dfe2     InsecureWM_World4_ForSelection1   \n",
       "20  0db3c0cb32574c328ef8f26f667f886f  InsecureWS_World4_ForSelection1_3D   \n",
       "21  0db3c0cb32574c328ef8f26f667f886f  InsecureWS_World4_ForSelection1_3D   \n",
       "\n",
       "                                             item_url  item_type  \n",
       "13  https://datascienceqa.esri.com/portal/home/ite...    Web Map  \n",
       "14  https://datascienceqa.esri.com/portal/home/ite...    Web Map  \n",
       "20  https://datascienceqa.esri.com/portal/home/ite...  Web Scene  \n",
       "21  https://datascienceqa.esri.com/portal/home/ite...  Web Scene  "
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "if df is not None:\n",
    "    display(df.query(\"is_secure == False\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "as well as use any of the other powerful pandas functionality to gain more insight into the data."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     1e44753d695e4abb860a74fdf091a5f7\n",
       "8     2b7bfdc9ea45431b8390b5425466dfe2\n",
       "15    0db3c0cb32574c328ef8f26f667f886f\n",
       "Name: item_id, dtype: object"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "if df is not None:\n",
    "    display(df.query(\"is_secure == True\")['item_id'].drop_duplicates())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Conclusion\n",
    "\n",
    "This notebook provided the workflow for identifying WebMap/WebScene/App Items that use insecure URLs and placed the results in two output `.csv` files. This notebook can be a powerful administrative tool to help you increase the security of your maps and apps. As the saying goes: \"Security is always excessive until it's not enough\".\n",
    "\n",
    "### Rewrite this Notebook\n",
    "\n",
    "This notebook can be rewritten to solve related problems. One of these problems is to identify WebMaps/WebScenes/Apps that contain services from an old ArcGIS Server that you are planning to turn off. Replace the `is_http()` and `is_https()` functions with something like:\n",
    "\n",
    "```python\n",
    "def is_from_domain(url):\n",
    "    return 'old-arcgis-server-domain.com' in url\n",
    "```\n",
    "\n",
    "You can then use a lot of the remaining functionality of this notebook to check to make sure that your items would not be affected by turning off the old ArcGIS Server.\n",
    "\n",
    "### Related Notebooks\n",
    "\n",
    "For related notebooks, search for the following in your samples notebook gallery:\n",
    "\n",
    "- Check WebMaps for Broken URLs\n"
   ]
  }
 ],
 "metadata": {
  "esriNotebookRuntime": {
   "notebookRuntimeName": "ArcGIS Notebook Python 3 Standard",
   "notebookRuntimeVersion": "10.7.1"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": false,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "**Table of Contents**",
   "title_sidebar": "Contents",
   "toc_cell": true,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}