
Commit cbe17d7

harenbergsd authored and rasbt committed
Fpmax (#553)
* Add fpmax algorithm to frequent patterns module
* Refactor unit tests for frequent patterns
* Small fix to fpmax
* Add unit tests for fpmax
* Fix unit tests for apriori and growth plus more refactoring
* Change EOL to match rest of repo (LF instead of CRLF)
* Remove unittest parent class from frequent pattern tests as it is unneeded
* Improve valid val check performance in frequent patterns
* Fix some pep8 issues
* Fix pytest issues
* Refactor fpgrowth
* add boolean array to unit tests
* add documentation
1 parent b20e57c commit cbe17d7

15 files changed, +1270 -453 lines changed

docs/mkdocs.yml

+1
@@ -93,6 +93,7 @@ nav:
  - user_guide/frequent_patterns/apriori.md
  - user_guide/frequent_patterns/association_rules.md
  - user_guide/frequent_patterns/fpgrowth.md
+ - user_guide/frequent_patterns/fpmax.md
  - general concepts:
  - user_guide/general_concepts/activation-functions.md
  - user_guide/general_concepts/gradient-optimization.md

docs/sources/CHANGELOG.md

+2 -1
@@ -21,8 +21,9 @@ The CHANGELOG for the current development version is available at
  - Added optional `groups` parameter to `SequentialFeatureSelector` and `ExhaustiveFeatureSelector` `fit()` methods for forwarding to sklearn CV ([#537](https://github.com/rasbt/mlxtend/pull/537) via [arc12](https://github.com/qiaguhttps://github.com/arc12))
  - Added a new `plot_pca_correlation_graph` function to the `mlxtend.plotting` submodule for plotting a PCA correlation graph. ([#544](https://github.com/rasbt/mlxtend/pull/544) via [Gabriel-Azevedo-Ferreira](https://github.com/qiaguhttps://github.com/Gabriel-Azevedo-Ferreira))
  - Added a `zoom_factor` parameter to the `mlxten.plotting.plot_decision_region` function that allows users to zoom in and out of the decision region plots. ([#545](https://github.com/rasbt/mlxtend/pull/545))
- - Added a function `fpgrowth` that implements the FP-Growth algorithm for mining frequent itemsets as a drop-in replacement of the existing `apriori` algorithm. ([#550](https://github.com/rasbt/mlxtend/pull/550) via [Steve Harenberg](https://github.com/harenbergsd))
+ - Added a function `fpgrowth` that implements the FP-Growth algorithm for mining frequent itemsets as a drop-in replacement for the existing `apriori` algorithm. ([#550](https://github.com/rasbt/mlxtend/pull/550) via [Steve Harenberg](https://github.com/harenbergsd))
  - New `heatmap` function in `mlxtend.plotting`. ([#552](https://github.com/rasbt/mlxtend/pull/552))
+ - Added a function `fpmax` that implements the FP-Max algorithm for mining maximal itemsets as a drop-in replacement for the `fpgrowth` algorithm. ([#553](https://github.com/rasbt/mlxtend/pull/553) via [Steve Harenberg](https://github.com/harenbergsd))
  - New `figsize` parameter for the `plot_decision_regions` function in `mlxtend.plotting`. ([#555](https://github.com/rasbt/mlxtend/pull/555) via [Mirza Hasanbasic](https://github.com/kazyka))

  ##### Changes
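
Note on the `fpmax` changelog entry above: a frequent itemset is maximal when none of its proper supersets is also frequent. The sketch below illustrates only that definition. It uses brute-force enumeration, so it is neither the FP-Max algorithm nor mlxtend's implementation. The helper names `frequent_itemsets` and `maximal_itemsets` and the toy transactions are hypothetical and chosen for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force: return {itemset: support} for every itemset
    whose support is at least min_support (illustration only)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Support = fraction of transactions containing the candidate.
            support = sum(cand <= t for t in transactions) / n
            if support >= min_support:
                frequent[cand] = support
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(itemset < other for other in frequent)
    }


# Hypothetical toy data, not taken from the mlxtend documentation.
transactions = [
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread", "jam"}),
]

frequent = frequent_itemsets(transactions, min_support=0.5)
maximal = maximal_itemsets(frequent)

print(len(frequent), "frequent itemsets")
print(sorted(sorted(s) for s in maximal), "maximal itemsets")
```

With a minimum support of 0.5, this toy data has seven frequent itemsets but only one maximal itemset, `{bread, eggs, milk}`, because every other frequent itemset is a subset of it. This is why a maximal-itemset miner generally returns a shorter result than `apriori` or `fpgrowth` would on the same input.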

docs/sources/USER_GUIDE_INDEX.md

+1
@@ -64,6 +64,7 @@
  - [apriori](user_guide/frequent_patterns/apriori.md)
  - [association_rules](user_guide/frequent_patterns/association_rules.md)
  - [fpgrowth](user_guide/frequent_patterns/fpgrowth.md)
+ - [fpmax](user_guide/frequent_patterns/fpmax.md)

  ## `general concepts`
  - [activation-functions](user_guide/general_concepts/activation-functions.md)

docs/sources/user_guide/frequent_patterns/apriori.ipynb

+8 -22
@@ -41,7 +41,12 @@
  "source": [
  "## References\n",
  "\n",
- "[1] Agrawal, Rakesh, and Ramakrishnan Srikant. \"[Fast algorithms for mining association rules](https://www.it.uu.se/edu/course/homepage/infoutv/ht08/vldb94_rj.pdf).\" Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994."
+ "[1] Agrawal, Rakesh, and Ramakrishnan Srikant. \"[Fast algorithms for mining association rules](https://www.it.uu.se/edu/course/homepage/infoutv/ht08/vldb94_rj.pdf).\" Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994.\n",
+ "\n",
+ "## Related\n",
+ "\n",
+ "- [FP-Growth](../fpgrowth.md)\n",
+ "- [FP-Max](../fpmax.md)"
  ]
  },
  {
@@ -53,9 +58,7 @@
  },
  {
  "cell_type": "markdown",
- "metadata": {
- "collapsed": true
- },
+ "metadata": {},
  "source": [
  "The `apriori` function expects data in a one-hot encoded pandas DataFrame.\n",
  "Suppose we have the following transaction data:"
@@ -923,23 +926,6 @@
  "name": "stdout",
  "output_type": "stream",
  "text": [
- "\r",
- "Iteration: 1 | Sampling itemset size 2\r",
- "Iteration: 2 | Sampling itemset size 2\r",
- "Iteration: 3 | Sampling itemset size 2\r",
- "Iteration: 4 | Sampling itemset size 2\r",
- "Iteration: 5 | Sampling itemset size 2\r",
- "Iteration: 6 | Sampling itemset size 2\r",
- "Iteration: 7 | Sampling itemset size 2\r",
- "Iteration: 8 | Sampling itemset size 2\r",
- "Iteration: 9 | Sampling itemset size 2\r",
- "Iteration: 10 | Sampling itemset size 2\r",
- "Iteration: 11 | Sampling itemset size 3\r",
- "Iteration: 12 | Sampling itemset size 3\r",
- "Iteration: 13 | Sampling itemset size 3\r",
- "Iteration: 14 | Sampling itemset size 3\r",
- "Iteration: 15 | Sampling itemset size 3\r",
- "Iteration: 16 | Sampling itemset size 3\r",
  "Iteration: 17 | Sampling itemset size 3\n"
  ]
  },
@@ -1176,5 +1162,5 @@
  }
  },
  "nbformat": 4,
- "nbformat_minor": 1
+ "nbformat_minor": 2
  }

docs/sources/user_guide/frequent_patterns/fpgrowth.ipynb

+6 -1
@@ -47,7 +47,12 @@
  "\n",
  "[1] Han, Jiawei, Jian Pei, Yiwen Yin, and Runying Mao. \"Mining frequent patterns without candidate generation. \"[A frequent-pattern tree approach.](https://link.springer.com/content/pdf/10.1023%2FB%3ADAMI.0000005258.31418.83.pdf)\" Data mining and knowledge discovery 8, no. 1 (2004): 53-87.\n",
  "\n",
- "[2] Agrawal, Rakesh, and Ramakrishnan Srikant. \"[Fast algorithms for mining association rules](https://www.it.uu.se/edu/course/homepage/infoutv/ht08/vldb94_rj.pdf).\" Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994."
+ "[2] Agrawal, Rakesh, and Ramakrishnan Srikant. \"[Fast algorithms for mining association rules](https://www.it.uu.se/edu/course/homepage/infoutv/ht08/vldb94_rj.pdf).\" Proc. 20th int. conf. very large data bases, VLDB. Vol. 1215. 1994.\n",
+ "\n",
+ "## Related\n",
+ "\n",
+ "- [FP-Max](../fpmax.md)\n",
+ "- [Apriori](../apriori.md)"
  ]
  },
  {
